Query Syntax

This page describes Verdict’s query syntax. In general, Verdict follows the syntax of the standard SQL. There are some temporary limitations (highlighted using blue boxes below), which will be gradually lifted in future versions.

Query

The Verdict query must be an aggregate query with optional groupby, orderby, and limit clauses.

query := SELECT aggregate_function, ...
         FROM relation
         [GROUP BY base_attr, ...]
         [ORDER BY alias, ...]
         [LIMIT int]

alias := str

base_attr := str

Example:

SELECT sum(price) as price_sum, count(*) as c
FROM hive.tpch_sf100.lineitem_premerged
GROUP BY l_linestatus
ORDER BY price_sum
LIMIT 5

Note

If the groupby clause is present, the grouping columns will be prepended in the result set. In the future, this behavior will be changed to follow the standard SQL semantics.

Relation

A relation can be a base table, joins of relations, or subqueries.

relation := base_table |
            relation join_expr |
            (SELECT attr_alias, ...
             FROM relation
             [WHERE predicate]
             [GROUP BY attr, ...]) alias

join_expr := join_type relation ON base_attr = base_attr

join_type := INNER JOIN |
             LEFT JOIN |
             RIGHT JOIN |
             OUTER JOIN

attr_alias := attr [AS] alias

Examples of the relation:

-- example 1
(
    select
        l_returnflag,
        l_quantity,
        l_extendedprice,
        l_discount,
        l_extendedprice,
        l_extendedprice * (1 - l_discount) disc_price,
        l_extendedprice * (1 - l_discount) * (1 + l_tax) charge,
        l_returnflag,
        l_linestatus
    from
        hive.tpch_tiny.lineitem_premerged
    where
        l_shipdate <= date '1998-12-01'
) t1

-- example 2
(
    select
        l_orderkey,
        l_extendedprice * (1 - l_discount) revenue,
        o_orderdate,
        o_shippriority
    from
        hive.tpch_tiny.lineitem_premerged l
            inner join hive.tpch_tiny.orders_premerged o
            on l_orderkey = o_orderkey
    where
        c_mktsegment = 'BUILDING'
        and o_orderdate < date '1995-03-22'
        and l_shipdate > date '1995-03-22'
) t1

Note that in the above example, t1 is the alias of the subquery relation. If the alias is omitted the same name is assigned for base attributes and an arbitrary name is assigned for derived attributes (e.g., l_extendedprice * (1 - l_discount)).

Note

The join type must be equijoin (whether it be inner, left, or right). The attribute that appears on the left-hand side of the equality sign is assumed to the attribute in the left join table. The similar rule applies for the right attribute.

Attribute

An attribute can be a base attribute or some functions of it.

attr := base_attr |
        constant |
        scalar_function |
        aggregate_function

constant := int |
            str |
            date '0000-00-00' |
            timestamp '0000-00-00 00:00:00'

predicate := logical_expr |
             comparative_expr

logical_expr := predicate AND predicate |
                predicate OR predicate  |
                NOT predicate

comparative_expr := attr > attr |
                    attr < attr |
                    attr >= attr |
                    attr <= attr |
                    attr <> attr |
                    attr in [ constant, ... ]

We describe more details about functions in the subsequent sections.

Scalar Functions

A scalar function is the function that produces an output value for each input value.

scalar_function := math_function |
                   string_function

Mathematical Functions

math_function := attr + attr |
                 attr - attr |
                 attr * attr |
                 attr / attr |
                 floor(attr) |
                 ceil(attr)  |
                 round(attr)

String Functions

string_function := SUBSTR(attr, start, length) |
                   TO_STRING(attr) |
                   CAST(attr AS VARCHAR) |
                   CONCAT(attr, attr) |
                   LENGTH(attr) |
                   REPLACE(old, new) |
                   UPPER(attr) |
                   LOWER(attr) |
                   STARTSWITH(attr, pattern) |
                   CONTAINS(attr, pattern) |
                   ENDSWITH(attr, pattern)

Note

We are adding more scalar functions.

Aggregate Functions

An aggregate function is a function that produces a single row given multiple rows.

aggregate_function := COUNT(*) |
                      SUM(base_attr) |
                      AVG(base_attr)

Note

To use a derived attribute within aggregate functions, you can first create new attributes using subqueries, then attribute those new attributes.