Commands

SELECT
UPSERT VALUES
UPSERT SELECT
DELETE
CREATE TABLE
DROP TABLE
CREATE FUNCTION
DROP FUNCTION
CREATE VIEW
DROP VIEW
CREATE SEQUENCE
DROP SEQUENCE
ALTER
CREATE INDEX
DROP INDEX
ALTER INDEX
EXPLAIN
UPDATE STATISTICS

Other Grammar

Constraint
Options
Hint
Scan Hint
Cache Hint
Index Hint
Small Hint
Seek To Column Hint
Join Hint
Serial Hint
Column Def
Table Ref
Sequence Ref
Column Ref
Select Expression
Select Statement
Split Point
Table Spec
Aliased Table Ref
Join Type
Func Argument
Class Name
Jar Path
Order
Expression
And Condition
Boolean Condition
Condition
RHS Operand
Operand
Summand
Factor
Term
Array Constructor
Sequence
Cast
Row Value Constructor
Bind Parameter
Value
Case
Case When
Name
Quoted Name
Alias
Null
Data Type
SQL Data Type
HBase Data Type
String
Boolean
Numeric
Int
Long
Decimal
Number
Comments

SELECT

selectStatement [ { UNION ALL selectStatement [...] } ]
[ ORDER BY order [,...] ] [ LIMIT {bindParameter | number} ]
selectStatement
 
UNION ALL selectStatement
 
...

 
ORDER BY order
 
, ...
 
LIMIT
bindParameter
number

Selects data from one or more tables. UNION ALL combines rows from multiple select statements. ORDER BY sorts the result based on the given expressions. LIMIT limits the number of rows returned by the query with no limit applied if unspecified or specified as null or less than zero. The LIMIT clause is executed after the ORDER BY clause to support top-N type queries. An optional hint may be used to override decisions made by the query optimizer.

Example:

SELECT * FROM TEST LIMIT 1000;
SELECT full_name FROM SALES_PERSON WHERE ranking >= 5.0
    UNION ALL SELECT reviewer_name FROM CUSTOMER_REVIEW WHERE score >= 8.0

UPSERT VALUES

UPSERT INTO tableName [( { columnRef | columnDef } [,...] )] VALUES ( constantTerm [,...] )
UPSERT INTO tableName
 
(
columnRef
columnDef
 
, ...
)
VALUES ( constantTerm
 
, ...
)

Inserts if not present and updates otherwise the value in the table. The list of columns is optional and if not present, the values will map to the column in the order they are declared in the schema. The values must evaluate to constants.

Example:

UPSERT INTO TEST VALUES('foo','bar',3);
UPSERT INTO TEST(NAME,ID) VALUES('foo',123);

UPSERT SELECT

UPSERT [/*+ hint */] INTO tableName [( { columnRef | columnDef } [,...] )] select
UPSERT
 
/ * + hint * /
INTO tableName
 
(
columnRef
columnDef
 
, ...
)
select

Inserts if not present and updates otherwise rows in the table based on the results of running another query. The values are set based on their matching position between the source and target tables. The list of columns is optional and if not present will map to the column in the order they are declared in the schema. If auto commit is on, and both a) the target table matches the source table, and b) the select performs no aggregation, then the population of the target table will be done completely on the server-side (with constraint violations logged, but otherwise ignored). Otherwise, data is buffered on the client and, if auto commit is on, committed in row batches as specified by the UpsertBatchSize connection property (or the phoenix.mutate.upsertBatchSize HBase config property which defaults to 10000 rows)

Example:

UPSERT INTO test.targetTable(col1, col2) SELECT col3, col4 FROM test.sourceTable WHERE col5 < 100
UPSERT INTO foo SELECT * FROM bar;

DELETE

DELETE [/*+ hint */] FROM tableName [ WHERE expression ]
[ ORDER BY order [,...] ] [ LIMIT {bindParameter | number} ]
DELETE
 
/ * + hint * /
FROM tableName
 
WHERE expression

 
ORDER BY order
 
, ...
 
LIMIT
bindParameter
number

Deletes the rows selected by the where clause. If auto commit is on, the deletion is performed completely server-side.

Example:

DELETE FROM TEST;
DELETE FROM TEST WHERE ID=123;
DELETE FROM TEST WHERE NAME LIKE 'foo%';

CREATE TABLE

CREATE TABLE [IF NOT EXISTS] tableRef
( columnDef [,...] [constraint] )
[tableOptions] [ SPLIT ON ( splitPoint [,...] ) ]
CREATE TABLE
 
IF NOT EXISTS
tableRef

( columnDef
 
, ...
 
constraint
)

 
tableOptions
 
SPLIT ON ( splitPoint
 
, ...
)

Creates a new table. The HBase table and any column families referenced are created if they don't already exist. All table, column family and column names are uppercased unless they are double quoted in which case they are case sensitive. Column families that exist in the HBase table but are not listed are ignored. At create time, to improve query performance, an empty key value is added to the first column family of any existing rows or the default column family if no column families are explicitly defined. Upserts will also add this empty key value. This improves query performance by having a key value column we can guarantee always being there and thus minimizing the amount of data that must be projected and subsequently returned back to the client. HBase table and column configuration options may be passed through as key/value pairs to configure the HBase table as desired. Note that when using the IF NOT EXISTS clause, if a table already exists, then no change will be made to it. Additionally, no validation is done to check whether the existing table metadata matches the proposed table metadata. so it's better to use DROP TABLE followed by CREATE TABLE is the table metadata may be changing.

Example:

CREATE TABLE my_schema.my_table ( id BIGINT not null primary key, date)
CREATE TABLE my_table ( id INTEGER not null primary key desc, date DATE not null,
    m.db_utilization DECIMAL, i.db_utilization)
    m.DATA_BLOCK_ENCODING='DIFF'
CREATE TABLE stats.prod_metrics ( host char(50) not null, created_date date not null,
    txn_count bigint CONSTRAINT pk PRIMARY KEY (host, created_date) )
CREATE TABLE IF NOT EXISTS "my_case_sensitive_table"
    ( "id" char(10) not null primary key, "value" integer)
    DATA_BLOCK_ENCODING='NONE',VERSIONS=5,MAX_FILESIZE=2000000 split on (?, ?, ?)
CREATE TABLE IF NOT EXISTS my_schema.my_table (
    org_id CHAR(15), entity_id CHAR(15), payload binary(1000),
    CONSTRAINT pk PRIMARY KEY (org_id, entity_id) )
    TTL=86400

DROP TABLE

DROP TABLE [IF EXISTS] tableRef [CASCADE]
DROP TABLE
 
IF EXISTS
tableRef
 
CASCADE

Drops a table. The optional CASCADE keyword causes any views on the table to be dropped as well. When dropping a table, by default the underlying HBase data and index tables are dropped. The phoenix.schema.dropMetaData may be used to override this and keep the HBase table for point-in-time queries.

Example:

DROP TABLE my_schema.my_table;
DROP TABLE IF EXISTS my_table;
DROP TABLE my_schema.my_table CASCADE;

CREATE FUNCTION

CREATE [TEMPORARY] FUNCTION funcName
([funcArgument[,...]])
RETURNS dataType AS className [USING JAR jarPath]
CREATE
 
TEMPORARY
FUNCTION funcName

(
 
funcArgument
 
, ...
)

RETURNS dataType AS className
 
USING JAR jarPath

Creates a new function. The function name is uppercased unless they are double quoted in which case they are case sensitive. The function accepts zero or more arguments. The class name and jar path should be in single quotes. The jar path is optional and if not specified then the class name will be loaded from the jars present in directory configured for hbase.dynamic.jars.dir.

Example:

CREATE FUNCTION my_reverse(varchar) returns varchar as 'com.mypackage.MyReverseFunction' using jar 'hdfs:/localhost:8080/hbase/lib/myjar.jar'
CREATE FUNCTION my_reverse(varchar) returns varchar as 'com.mypackage.MyReverseFunction'
CREATE FUNCTION my_increment(integer, integer constant defaultvalue='10') returns integer as 'com.mypackage.MyIncrementFunction' using jar '/hbase/lib/myincrement.jar'
CREATE TEMPORARY FUNCTION my_reverse(varchar) returns varchar as 'com.mypackage.MyReverseFunction' using jar 'hdfs:/localhost:8080/hbase/lib/myjar.jar'

DROP FUNCTION

DROP FUNCTION [IF EXISTS] funcName
DROP FUNCTION
 
IF EXISTS
funcName

Drops a function.

Example:

DROP FUNCTION IF EXISTS my_reverse
DROP FUNCTION my_reverse

CREATE VIEW

CREATE VIEW [IF NOT EXISTS] newTableRef
[ ( columnDef [,...] ) ]
[AS SELECT * FROM existingTableRef [WHERE expression] ]
[tableOptions]
CREATE VIEW
 
IF NOT EXISTS
newTableRef

 
( columnDef
 
, ...
)

 
AS SELECT * FROM existingTableRef
 
WHERE expression

 
tableOptions

Creates a new view over an existing HBase or Phoenix table. As expected, the WHERE expression is always automatically applied to any query run against the view. As with CREATE TABLE, the table, column family, and column names are uppercased unless they are double quoted. The newTableRef may refer directly to an HBase table, in which case, the table, column family, and column names must match the existing metadata exactly or an exception will occur. When a view is mapped directly to an HBase table, no empty key value will be added to rows and the view will be read-only. A view will be updatable (i.e. referenceable in a DML statement such as UPSERT or DELETE) if its WHERE clause expression contains only simple equality expressions separated by ANDs. Updatable views are not required to set the columns which appear in the equality expressions, as the equality expressions define the default values for those columns. If they are set, then they must match the value used in the WHERE clause, or an error will occur. All columns from the existingTableRef are included as columns in the new view as are columns defined in the columnDef list. An ALTER VIEW statement may be issued against a view to remove or add columns, however, no changes may be made to the primary key constraint. In addition, columns referenced in the WHERE clause are not allowed to be removed. Once a view is created for a table, that table may no longer altered or dropped until all of its views have been dropped.

Example:

CREATE VIEW "my_hbase_table"
    ( k VARCHAR primary key, "v" UNSIGNED_LONG) default_column_family='a';
CREATE VIEW my_view ( new_col SMALLINT )
    AS SELECT * FROM my_table WHERE k = 100;
CREATE VIEW my_view_on_view
    AS SELECT * FROM my_view WHERE new_col > 70;

DROP VIEW

DROP VIEW [IF EXISTS] tableRef [CASCADE]
DROP VIEW
 
IF EXISTS
tableRef
 
CASCADE

Drops a view. The optional CASCADE keyword causes any views derived from the view to be dropped as well. When dropping a view, the actual table data is not affected. However, index data for the view will be deleted.

Example:

DROP VIEW my_view
DROP VIEW IF EXISTS my_schema.my_view
DROP VIEW IF EXISTS my_schema.my_view CASCADE

CREATE SEQUENCE

CREATE SEQUENCE [IF NOT EXISTS] sequenceRef
[START [WITH] {number | bindParameter}] [INCREMENT [BY] {number | bindParameter}]
[MINVALUE {number | bindParameter}] [MAXVALUE {number | bindParameter}] [CYCLE]
[CACHE {number | bindParameter}]
CREATE SEQUENCE
 
IF NOT EXISTS
sequenceRef

 
START
 
WITH
number
bindParameter
 
INCREMENT
 
BY
number
bindParameter

 
MINVALUE
number
bindParameter
 
MAXVALUE
number
bindParameter
 
CYCLE

 
CACHE
number
bindParameter

Creates a monotonically increasing sequence. START controls the initial sequence value while INCREMENT controls by how much the sequence is incremented after each call to NEXT VALUE FOR. By default, the sequence will start with 1 and be incremented by 1. Specify CYCLE to indicate that the sequence should continue to generate values after reaching either its MINVALUE or MAXVALUE. After an ascending sequence reaches its MAXVALUE, it generates its MINVALUE. After a descending sequence reaches its MINVALUE, it generates its MAXVALUE. CACHE controls how many sequence values will be reserved from the server, cached on the client, and doled out as need by subsequent NEXT VALUE FOR calls for that client connection to the cluster to save on RPC calls. If not specified, the phoenix.sequence.cacheSize config parameter defaulting to 100 will be used for the CACHE value.

Example:

CREATE SEQUENCE my_sequence;
CREATE SEQUENCE my_sequence START WITH -1000
CREATE SEQUENCE my_sequence INCREMENT BY 10
CREATE SEQUENCE my_schema.my_sequence START 0 CACHE 10

DROP SEQUENCE

DROP SEQUENCE [IF EXISTS] sequenceRef
DROP SEQUENCE
 
IF EXISTS
sequenceRef

Drops a sequence.

Example:

DROP SEQUENCE my_sequence
DROP SEQUENCE IF EXISTS my_schema.my_sequence

ALTER

ALTER {TABLE | VIEW} tableRef { { ADD [IF NOT EXISTS] columnDef [,...] [options] } | { DROP COLUMN [IF EXISTS] columnRef [,...] } | { SET options } }
ALTER
TABLE
VIEW
tableRef
ADD
 
IF NOT EXISTS
columnDef
 
, ...
 
options
DROP COLUMN
 
IF EXISTS
columnRef
 
, ...
SET options

Alters an existing table by adding or removing columns or updating table options. When a column is dropped from a table, the data in that column is deleted as well. PK columns may not be dropped, and only nullable PK columns may be added. For a view, the data is not affected when a column is dropped. Note that creating or dropping columns only affects subsequent queries and data modifications. Snapshot queries that are connected at an earlier timestamp will still use the prior schema that was in place when the data was written.

Example:

ALTER TABLE my_schema.my_table ADD d.dept_id char(10) VERSIONS=10
ALTER TABLE my_table ADD dept_name char(50), parent_id char(15) null primary key
ALTER TABLE my_table DROP COLUMN d.dept_id, parent_id;
ALTER VIEW my_view DROP COLUMN new_col;
ALTER TABLE my_table SET IMMUTABLE_ROWS=true,DISABLE_WAL=true;

CREATE INDEX

CREATE INDEX [IF NOT EXISTS] indexName
ON tableRef ( expression [ASC | DESC] [,...] )
[ INCLUDE ( columnRef [,...] ) ]
[ ASYNC ]
[indexOptions] [ SPLIT ON ( splitPoint [,...] ) ]
CREATE INDEX
 
IF NOT EXISTS
indexName

ON tableRef ( expression
 
ASC
DESC
 
, ...
)

 
INCLUDE ( columnRef
 
, ...
)

 
ASYNC

 
indexOptions
 
SPLIT ON ( splitPoint
 
, ...
)

Creates a new secondary index on a table or view. The index will be automatically kept in sync with the table as the data changes. At query time, the optimizer will use the index if it contains all columns referenced in the query and produces the most efficient execution plan. If a table has rows that are write-once and append-only, then the table may set the IMMUTABLE_ROWS property to true (either up-front in the CREATE TABLE statement or afterwards in an ALTER TABLE statement). This reduces the overhead at write time to maintain the index. Otherwise, if this property is not set on the table, then incremental index maintenance will be performed on the server side when the data changes. As of the 4.3 release, functional indexes are supported which allow arbitrary expressions rather than solely column names to be indexed. As of the 4.4.0 release, you can specify the ASYNC keyword to create the index using a map reduce job.

Example:

CREATE INDEX my_idx ON sales.opportunity(last_updated_date DESC)
CREATE INDEX my_idx ON log.event(created_date DESC) INCLUDE (name, payload) SALT_BUCKETS=10
CREATE INDEX IF NOT EXISTS my_comp_idx ON server_metrics ( gc_time DESC, created_date DESC )
    DATA_BLOCK_ENCODING='NONE',VERSIONS=?,MAX_FILESIZE=2000000 split on (?, ?, ?)
CREATE INDEX my_idx ON sales.opportunity(UPPER(contact_name))

DROP INDEX

DROP INDEX [IF EXISTS] indexName ON tableRef
DROP INDEX
 
IF EXISTS
indexName ON tableRef

Drops an index from a table. When dropping an index, the data in the index is deleted. Note that since metadata is versioned, snapshot queries connecting at an earlier time stamp may still use the index, as the HBase table backing the index is not deleted.

Example:

DROP INDEX my_idx ON sales.opportunity
DROP INDEX IF EXISTS my_idx ON server_metrics

ALTER INDEX

ALTER INDEX [IF EXISTS] indexName ON tableRef { DISABLE | REBUILD | UNUSABLE | USABLE }
ALTER INDEX
 
IF EXISTS
indexName ON tableRef
DISABLE
REBUILD
UNUSABLE
USABLE

Alters the state of an existing index.  DISABLE will cause the no further index maintenance to be performed on the index and it will no longer be considered for use in queries. REBUILD will completely rebuild the index and upon completion will enable the index to be used in queries again. UNUSABLE will cause the index to no longer be considered for use in queries, however index maintenance will continue to be performed. USABLE will cause the index to again be considered for use in queries. Note that a disabled index must be rebuild and cannot be set as USABLE.

Example:

ALTER INDEX my_idx ON sales.opportunity DISABLE
ALTER INDEX IF EXISTS my_idx ON server_metrics REBUILD

EXPLAIN

EXPLAIN {select|upsertSelect|delete}

Computes the logical steps necessary to execute the given command. Each step is represented as a string in a single column result set row.

Example:

EXPLAIN SELECT NAME, COUNT(*) FROM TEST GROUP BY NAME HAVING COUNT(*) > 2;
EXPLAIN SELECT entity_id FROM CORE.CUSTOM_ENTITY_DATA WHERE organization_id='00D300000000XHP' AND SUBSTR(entity_id,1,3) = '002' AND created_date < CURRENT_DATE()-1;

UPDATE STATISTICS

UPDATE STATISTICS tableRef [ALL | INDEX | COLUMNS] [SET guidepostOptions]
UPDATE STATISTICS tableRef
 
ALL
INDEX
COLUMNS
 
SET guidepostOptions

Updates the statistics on the table and by default all of its associated index tables. To only update the table, use the COLUMNS option and to only update the INDEX, use the INDEX option. The statistics for a single index may also be updated by using its full index name for the tableRef. The default guidepost properties may be overridden by specifying their values after the SET keyword. Note that when a major compaction occurs, the default guidepost properties will be used again.

Example:

UPDATE STATISTICS my_table
UPDATE STATISTICS my_schema.my_table INDEX
UPDATE STATISTICS my_index
UPDATE STATISTICS my_table COLUMNS
UPDATE STATISTICS my_table SET phoenix.stats.guidepost.width=50000000

Constraint

CONSTRAINT constraintName PRIMARY KEY ( columnName
 
ASC
DESC
 
ROW_TIMESTAMP
 
, ...
)

Defines a multi-part primary key constraint. Each column may be declared to be sorted in ascending or descending ordering. The default is ascending. One primary key column can also be designated as ROW_TIMESTAMP provided it is of one of the types: BIGINT, UNSIGNED_LONG, DATE, TIME and TIMESTAMP.

Example:

CONSTRAINT my_pk PRIMARY KEY (host,created_date)
CONSTRAINT my_pk PRIMARY KEY (host ASC,created_date DESC)
CONSTRAINT my_pk PRIMARY KEY (host ASC,created_date DESC ROW_TIMESTAMP)

Options

 
familyName .
name = value
 
, ...

Sets a built-in Phoenix table property or an HBase table or column descriptor metadata attribute. The name is case insensitive. Built-in Phoenix table options include SALT_BUCKETS, DISABLE_WAL, IMMUTABLE_ROWS, MULTI_TENANT, DEFAULT_COLUMN_FAMILY, STORE_NULLS, TRANSACTIONAL, and UPDATE_CACHE_FREQUENCY. If the name is a known HColumnDescriptor attribute, then the value is applied to the specified column family or, if omitted, to all column families. Otherwise, the HBase metadata attribute value is applied to the HTableDescriptor. Note that no validation is performed on the property name or value, so unknown or misspelled options will end up as adhoc metadata attributes values on the HBase table.

The SALT_BUCKETS numeric property causes an extra byte to be transparently prepended to every row key to ensure an evenly distributed read and write load across all region servers. This is especially useful when your row key is always monotonically increasing and causing hot spotting on a single region server. However, even if it's not, it often improves performance by ensuring an even distribution of data across your cluster.  The byte is determined by hashing the row key and modding it with the SALT_BUCKETS value. The value may be from 0 to 256, with 0 being a special means of turning salting off for an index in which the data table is salted (since by default an index has the same number of salt buckets as its data table). If split points are not defined for the table, the table will automatically be pre-split at each possible salt bucket value. For more information, see http://phoenix.incubator.apache.org/salted.html

The DISABLE_WAL boolean option when true causes HBase not to write data to the write-ahead-log, thus making updates faster at the expense of potentially losing data in the event of a region server failure. This option is useful when updating a table which is not the source-of-truth and thus making the lose of data acceptable.

The IMMUTABLE_ROWS boolean option when true declares that your table has rows which are write-once, append-only (i.e. the same row is never updated). With this option set, indexes added to the table are managed completely on the client-side, with no need to perform incremental index maintenance, thus improving performance. Deletes of rows in immutable tables are allowed with some restrictions if there are indexes on the table. Namely, the WHERE clause may not filter on columns not contained by every index. Upserts are expected to never update an existing row (failure to follow this will result in invalid indexes). For more information, see http://phoenix.incubator.apache.org/secondary_indexing.html

The MULTI_TENANT boolean option when true enables views to be created over the table across different tenants. This option is useful to share the same physical HBase table across many different tenants. For more information, see http://phoenix.incubator.apache.org/multi-tenancy.html

The DEFAULT_COLUMN_FAMILY string option determines the column family used used when none is specified. The value is case sensitive. If this option is not present, a column family name of '0' is used.

The STORE_NULLS boolean option (available as of Phoenix 4.3) determines whether or not null values should be explicitly stored in HBase. This option is generally only useful if a table is configured to store multiple versions in order to facilitate doing flashback queries (i.e. queries to look at the state of a record in the past).

The TRANSACTIONAL option (available as of Phoenix 4.7) determines whether a table (and its secondary indexes) are tranactional. The default value is FALSE, but may be overriden with the phoenix.table.istransactional.default property. A table may be altered to become transactional, but it cannot be transitioned back to be non transactional. For more information on transactions, see http://phoenix.apache.org/transactions.html

The UPDATE_CACHE_FREQUENCY option (available as of Phoenix 4.7) determines how often the server will be checked for meta data updates (for example, the addition or removal of a table column or the updates of table statistics). Possible values are ALWAYS (the default), NEVER, and a millisecond numeric value. An ALWAYS value will cause the client to check with the server each time a statement is executed that references a table (or once per commit for an UPSERT VALUES statement).  A millisecond value indicates how long the client will hold on to its cached version of the metadata before checking back with the server for updates.

Example:

IMMUTABLE_ROWS=true
DEFAULT_COLUMN_FAMILY='a'
SALT_BUCKETS=10
DATA_BLOCK_ENCODING='NONE',a.VERSIONS=10
MAX_FILESIZE=2000000000,MEMSTORE_FLUSHSIZE=80000000
UPDATE_CACHE_FREQUENCY=300000

Hint

scanHint
indexHint
cacheHint
smallHint
joinHint
seekToColumnHint
serialHint
 
, ...

An advanced features that overrides default query processing behavior for decisions such as whether to use a range scan versus skip scan and an index versus no index. Note that strict parsing is not done on hints. If hints are misspelled or invalid, they are silently ignored.

Example:

SKIP_SCAN,NO_INDEX
NO_CACHE
INDEX(employee emp_name_idx emp_start_date_idx)
SMALL

Scan Hint

SKIP_SCAN
RANGE_SCAN

Use the SKIP_SCAN hint to force a skip scan to be performed on the query when it otherwise would not be. This option may improve performance if a query does not include the leading primary key column, but does include other, very selective primary key columns.

Use the RANGE_SCAN hint to force a range scan to be performed on the query. This option may improve performance if a query filters on a range for non selective leading primary key column along with other primary key columns

Example:

SKIP_SCAN
RANGE_SCAN

Cache Hint

NO_CACHE

Use the NO_CACHE hint to prevent the results of the query from populating the HBase block cache. This is useful in situation where you're doing a full table scan and know that it's unlikely that the rows being returned will be queried again.

Example:

NO_CACHE

Index Hint

INDEX
NO_INDEX
USE_INDEX_OVER_DATA_TABLE
USE_DATA_OVER_INDEX_TABLE

Use the INDEX(<table_name> <index_name>...) to suggest which index to use for a given query. Double quotes may be used to surround a table_name and/or index_name to make them case sensitive. As of the 4.3 release, this will force an index to be used, even if it doesn't contain all referenced columns, by joining back to the data table to retrieve any columns not contained by the index.

Use the NO_INDEX hint to force the data table to be used for a query.

Use the USE_INDEX_OVER_DATA_TABLE hint to act as a tiebreaker for choosing the index table over the data table when all other criteria are equal. Note that this is the default optimizer decision.

Use the USE_DATA_OVER_INDEX_TABLE hint to act as a tiebreaker for choosing the data table over the index table when all other criteria are equal.

Example:

INDEX(employee emp_name_idx emp_start_date_idx)
NO_INDEX
USE_INDEX_OVER_DATA_TABLE
USE_DATA_OVER_INDEX_TABLE

Small Hint

SMALL

Use the SMALL hint to reduce the number of RPCs done between the client and server when a query is executed. Generally, if the query is a point lookup or returns data that is likely in a single data block (64 KB by default), performance may improve when using this hint.

Example:

SMALL

Seek To Column Hint

SEEK_TO_COLUMN
NO_SEEK_TO_COLUMN

Use the SEEK_TO_COLUMN hint to force the server to seek to navigate between columns instead of doing a next. If there are many versions of the same column value or if there are many columns between the columns that are projected, then this may be more efficient.

Use the NO_SEEK_TO_COLUMN hint to force the server to do a next to navigate between columns instead of a seek. If there are few versions of the same column value or if the columns that are projected are adjacent to each other, then this may be more efficient.

Example:

SEEK_TO_COLUMN
NO_SEEK_TO_COLUMN

Join Hint

NO_STAR_JOIN

Use the NO_STAR_JOIN hint to prevent the optimizer from using the star join query to broadcast the results of the querying one common table to all region servers. This is useful when the results of the querying the one common table is too large and would likely be substantially filtered when joined against one or more of the other joined tables.

Example:

NO_STAR_JOIN

Serial Hint

SERIAL

Use the SERIAL hint to force a query to be executed serially as opposed to being parallelized along the guideposts and region boundaries.

Example:

SERIAL

Column Def

columnRef dataType
 
 
NOT
NULL
 
PRIMARY KEY
 
ASC
DESC
 
ROW_TIMESTAMP

Define a new primary key column. The column name is case insensitive by default and case sensitive if double quoted. The sort order of a primary key may be ascending (ASC) or descending. The default is ascending. Additionally, if the column is the only column that forms the primary key, then it can be designated as ROW_TIMESTAMP column provided its data type is one of these: BIGINT, UNSIGNED_LONG, DATE, TIME and TIMESTAMP.

Example:

id char(15) not null primary key
key integer null
m.response_time bigint

created_date date not null primary key row_timestamp
key integer null
m.response_time bigint

Table Ref

 
schemaName .
tableName

References a table or view with an optional schema name qualifier

Example:

Sales.Contact
HR.Employee
Department

Sequence Ref

 
schemaName .
sequenceName

References a sequence with an optional schema name qualifier

Example:

my_id_generator
my_seq_schema.id_generator

Column Ref

 
familyName .
columnName

References a column with an optional family name qualifier

Example:

e.salary
dept_name

Select Expression

*
( familyName . * )
expression
 
 
AS
columnAlias

An expression in a SELECT statement. All columns in a table may be selected using *, and all columns in a column family may be selected using <familyName>.*.

Example:

*
cf.*
ID AS VALUE
VALUE + 1 VALUE_PLUS_ONE

Select Statement

SELECT
 
/ * + hint * /
 
DISTINCT
ALL
selectExpression
 
, ...

FROM tableSpec
 
 
joinType
JOIN tableSpec ON expression
 
...

 
WHERE expression

 
GROUP BY expression
 
, ...
 
HAVING expression

Selects data from a table. DISTINCT filters out duplicate results while ALL, the default, includes all results. FROM identifies the table being queried. Columns may be dynamically defined in parenthesis after the table name and then used in the query. Joins are processed in reverse order through a broadcast hash join mechanism. For best performance, order tables from largest to smallest in terms of how many rows you expect to be used from each table. GROUP BY groups the the result by the given expression(s). HAVING filters rows after grouping. An optional hint may be used to override decisions made by the query optimizer.

Example:

SELECT * FROM TEST;
SELECT DISTINCT NAME FROM TEST;
SELECT ID, COUNT(1) FROM TEST GROUP BY ID;
SELECT NAME, SUM(VAL) FROM TEST GROUP BY NAME HAVING COUNT(1) > 2;
SELECT d.dept_id,e.dept_id,e.name FROM DEPT d JOIN EMPL e ON e.dept_id = d.dept_id;

Split Point

value
bindParameter

Defines a split point for a table. Use a bind parameter with preparedStatement.setBinary(int,byte[]) to supply arbitrary bytes.

Example:

'A'

Table Spec

aliasedTableRef
( select )
 
 
AS
tableAlias

An optionally aliased table reference, or an optionally aliased select statement in paranthesis.

Example:

PRODUCT_METRICS AS PM
PRODUCT_METRICS(referrer VARCHAR)
( SELECT feature FROM PRODUCT_METRICS ) AS PM

Aliased Table Ref

 
schemaName .
tableName
 
 
AS
tableAlias
 
( columnDef
 
, ...
)

A reference to an optionally aliased table optionally followed by dynamic column definitions.

Example:

PRODUCT_METRICS AS PM
PRODUCT_METRICS(referrer VARCHAR)

Join Type

INNER
LEFT
RIGHT
 
OUTER

The type of join

Example:

INNER
LEFT OUTER
RIGHT

Func Argument

dataType
 
CONSTANT
 
DEFUALTVALUE = string
 
MINVALUE = string
 
MAXVALUE = string

The function argument is sql data type. It can be constant and also we can provide default,min and max values for the argument in single quotes.

Example:

VARCHAR
INTEGER DEFAULTVALUE='100'
INTEGER CONSTANT DEFAULTVALUE='10' MINVALUE='1' MAXVALUE='15'

Class Name

String

Canonical class name in single quotes.

Example:

'com.mypackage.MyReverseFunction'

Jar Path

String

Hdfs path of jar in single quotes.

Example:

'hdfs://localhost:8080:/hbase/lib/myjar.jar'
'/tmp/lib/myjar.jar'

Order

expression
 
ASC
DESC
 
NULLS
FIRST
LAST

Sorts the result by an expression.

Example:

NAME DESC NULLS LAST

Expression

andCondition
 
OR andCondition
 
...

Value or condition.

Example:

ID=1 OR NAME='Hi'

And Condition

booleanCondition
 
AND booleanCondition
 
...

Condition separated by AND.

Example:

FOO!='bar' AND ID=1

Boolean Condition

 
NOT
condition

Boolean condition.

Example:

ID=1 AND NAME='Hi'

Condition

operand
 
=
<
>
< =
> =
< >
! =
rhsOperand
LIKE
ILIKE
operand
IS
 
NOT
NULL
 
NOT
IN (
select
constantOperand
 
, ...
)
EXISTS ( select )
BETWEEN operand AND operand

Boolean value or condition. When comparing with LIKE, the wildcards characters are _ (any one character) and % (any characters). ILIKE is the same, but the search is case insensitive. To search for the characters % and _, the characters need to be escaped. The escape character is \ (backslash). Patterns that end with an escape character are invalid and the expression returns NULL. BETWEEN does an inclusive comparison for both operands.

Example:

FOO = 'bar'
NAME LIKE 'Jo%'
IN (1, 2, 3)
NOT EXISTS (SELECT 1 FROM FOO WHERE BAR < 10)
N BETWEEN 1 and 100

RHS Operand

operand
ANY
ALL
(
operand
select
)

Right-hand side operand

Example:

s.my_col
ANY(my_col + 1)
ALL(select foo from bar where bas > 5)

Operand

summand
 
|| summand
 
...

A string concatenation.

Example:

'foo'|| s

Summand

factor
 
+
-
factor
 
...

An addition or subtraction of numeric or date type values

Example:

a + b
a - b

Factor

term
 
*
/
%
term
 
...

A multiplication, division, or modulus of numeric type values.

Example:

c * d
e / 5
f % 10

Term

value
( expression )
bindParameter
Function
case
caseWhen
 
tableAlias .
columnRef
rowValueConstructor
cast
sequence
arrayConstructor
 
[ expression ]

A term which may use subscript notation if it's an array.

Example:

'Hello'
23
my_array[my_index]
array_col[1]

Array Constructor

ARRAY [ expression
 
, ...
]

Constructs an ARRAY out of the list of expressions.

Example:

ARRAY[1.0,2.2,3.3]
ARRAY['foo','bas']
ARRAY[col1,col2,col3+1,?]

Sequence

NEXT
CURRENT
VALUE
number VALUES
FOR sequenceRef

Gets the CURRENT or NEXT value for a sequence, a monotonically incrementing BIGINT value. Each call to NEXT VALUE FOR increments the sequence value and returns the current value. The NEXT <n> VALUES syntax may be used to reserve <n> consecutive sequence values. A sequence is only increment once for a given statement, so multiple references to the same sequence by NEXT VALUE FOR produce the same value. Use CURRENT VALUE FOR to access the last sequence allocated with NEXT VALUE FOR for cluster connection of your client. If no NEXT VALUE FOR had been previously called, an error will occur. These calls are only allowed in the SELECT expressions or UPSERT VALUES expressions.

Example:

NEXT VALUE FOR my_table_id
NEXT 5 VALUES FOR my_table_id
CURRENT VALUE FOR my_schema.my_id_generator

Cast

CAST ( expression AS dataType )

The CAST operator coerces the given expression to a different dataType. This is useful, for example, to convert a BIGINT or INTEGER to a DECIMAL or DOUBLE to prevent truncation to a whole number during arithmetic operations. It is also useful to coerce from a more precise type to a less precise type since this type of coercion will not automatically occur, for example from a TIMESTAMP to a DATE. If the coercion is not possible, an error will occur.

Example:

CAST ( my_int AS DECIMAL )
CAST ( my_timestamp AS DATE )

Row Value Constructor

( expression , expression
 
...
)

A row value constructor is a list of other terms which are treated together as a kind of composite structure. They may be compared to each other or to other other terms. The main use case is 1) to enable efficiently stepping through a set of rows in support of query-more type functionality, or 2) to allow IN clause to perform point gets on composite row keys.

Example:

(col1, col2, 5)

Bind Parameter

?
: number

A parameters can be indexed, for example :1 meaning the first parameter.

Example:

:1
?

Value

string
numeric
boolean
null

A literal value of any data type, or null.

Example:

10

Case

CASE term WHEN expression THEN term
 
...

 
ELSE expression
END

Returns the first expression where the value is equal to the test expression. If no else part is specified, return NULL.

Example:

CASE CNT WHEN 0 THEN 'No' WHEN 1 THEN 'One' ELSE 'Some' END

Case When

CASE WHEN expression THEN term
 
...

 
ELSE term
END

Returns the first expression where the condition is true. If no else part is specified, return NULL.

Example:

CASE WHEN CNT<10 THEN 'Low' ELSE 'High' END

Name

A-Z | _
 
A-Z | _
0-9
 
...
quotedName

Unquoted names are not case sensitive. There is no maximum name length.

Example:

my_column

Quoted Name

" anything "

Quoted names are case sensitive, and can contain spaces. There is no maximum name length. Two double quotes can be used to create a single double quote inside an identifier.

Example:

"first-name"

Alias

name

An alias is a name that is only valid in the context of the statement.

Example:

A

Null

NULL

NULL is a value without data type and means 'unknown value'.

Example:

NULL

Data Type

sqlDataType
hbaseDataType
 
ARRAY
 
[
 
dimensionInt
]

A type name optionally declared as an array. An array is mapped to java.sql.Array. Only single dimension arrays are supported and varbinary arrays are not allowed.

Example:

CHAR(15)
VARCHAR
DECIMAL(10,2)
DOUBLE
DATE
VARCHAR ARRAY
CHAR(10) ARRAY [5]
INTEGER []

SQL Data Type

charType
varcharType
decimalType
tinyintType
smallintType
integerType
bigintType
floatType
doubleType
timestampType
dateType
timeType
binaryType
varbinaryType

A standard SQL data type.

Example:

TINYINT
CHAR(15)
VARCHAR
VARCHAR(1000)
DECIMAL(10,2)
DOUBLE
INTEGER
BINARY(200)
DATE

HBase Data Type

unsignedTimestampType
unsignedDateType
unsignedTimeType
unsignedTinyintType
unsignedSmallintType
unsignedIntType
unsignedLongType
unsignedFloatType
unsignedDoubleType

A type that maps to a native primitive HBase value serialized through the Bytes.toBytes() utility methods. Only positive values are allowed.

Example:

UNSIGNED_INT
UNSIGNED_DATE
UNSIGNED_LONG

String

' anything '

A string starts and ends with a single quote. Two single quotes can be used to create a single quote inside a string.

Example:

'John''s car'

Boolean

TRUE
FALSE

A boolean value.

Example:

TRUE

Numeric

int
long
decimal

The data type of a numeric value is always the lowest possible for the given value. If the number contains a dot this is decimal; otherwise it is int, long, or decimal (depending on the value).

Example:

SELECT -10.05
SELECT 5
SELECT 12345678912345

Int

 
-
number

The maximum integer number is 2147483647, the minimum is -2147483648.

Example:

10

Long

 
-
number

Long numbers are between -9223372036854775808 and 9223372036854775807.

Example:

100000

Decimal

 
-
number
 
. number

A decimal number with fixed precision and scale. Internally, java.lang.BigDecimal is used.

Example:

SELECT -10.5

Number

0-9
 
...

The maximum length of the number depends on the data type used.

Example:

100

Comments

- - anything
/ / anything
/ * anything * /

Comments can be used anywhere in a command and are ignored by the database. Line comments end with a newline. Block comments cannot be nested, but can be multiple lines long.

Example:

// This is a comment

Back to top