Query Server

The Phoenix Query Server provides an alternative means of interacting with Phoenix and HBase.

Overview

Phoenix 4.4 introduces a stand-alone server that exposes Phoenix to "thin" clients. It is based on the Avatica component of Apache Calcite. The query server consists of a Java server that manages Phoenix Connections on the clients' behalf.

With the introduction of the Protobuf transport, Avatica is moving towards backwards compatibility with the provided thin JDBC driver. There are no such backwards compatibility guarantees for the JSON API.

To repeat, there is no guarantee of backwards compatibility with the JSON transport. Compatibility with the Protobuf transport is stabilizing, although it has not been tested thoroughly enough to be stated as "guaranteed".

Clients

The primary client implementation is currently a JDBC driver with minimal dependencies. The default and primary transport mechanism since Phoenix 4.7 is Protobuf; the older JSON mechanism can still be enabled. The distribution includes the sqlline-thin.py CLI client, which uses the JDBC thin client.

The Phoenix project also maintains the Python driver phoenixdb.

The Avatica Go client can also be used.

Proprietary ODBC drivers are also available for Windows and Linux.

Installation

In the 4.4-4.14 and 5.0 releases the query server and its JDBC client are part of the standard Phoenix distribution. They require no additional dependencies or installation.

Starting with the 4.15 and 5.1 releases, the query server has been unbundled into the phoenix-queryserver repository, and its version number has been reset to 6.0.

Download the latest source or binary release from the Download page, or check out the development version from GitHub.

Either unpack the binary distribution or build it from source. See BUILDING.md in the source distribution for build instructions.

Usage

Server

The standalone Query Server distribution does not contain the necessary Phoenix (thick) client library by default.

If you use the standalone distribution, you will need either to rebuild it from source to include the client library (see BUILDING.md) or to manually copy the Phoenix thick client library into the installation directory.

The server component is managed through bin/queryserver.py. Its usage is as follows:

bin/queryserver.py [start|stop]

When invoked with no arguments, the query server is launched in the foreground, with logging directed to the console.

The first argument is an optional start or stop command to the daemon. When either is provided, the script takes the appropriate action on the daemon process, if one exists.

Any subsequent arguments are passed to the main class for interpretation.

The server is packaged in a standalone jar, phoenix-queryserver-<version>.jar. This jar, the phoenix-client.jar and HBASE_CONF_DIR on the classpath are all that is required to launch the server.

Client

Phoenix provides two mechanisms for interacting with the query server. A JDBC driver is provided in the standalone phoenix-queryserver-client-<version>.jar. The script bin/sqlline-thin.py is available for the command line.

The JDBC connection string is composed as follows:

jdbc:phoenix:thin:url=<scheme>://<server-hostname>:<port>[;option=value...]

<scheme> specifies the transport protocol (http or https) used when communicating with the server.

<server-hostname> is the name of the host offering the service.

<port> is the port number on which the host is listening. Default is 8765, though this is configurable (see below).

The full list of options that can be provided via the JDBC URL string is available in the Avatica documentation.
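As an illustration of the connection string format described above, the following minimal sketch composes a thin-client JDBC URL from its parts. The helper name `build_thin_url` and the host `pqs.example.com` are hypothetical and are not part of Phoenix; `serialization` is used only as an example option key.

```python
def build_thin_url(host, port=8765, scheme="http", options=None):
    """Compose a thin JDBC URL (hypothetical helper, not part of Phoenix)."""
    if scheme not in ("http", "https"):
        raise ValueError("scheme must be 'http' or 'https'")
    url = f"jdbc:phoenix:thin:url={scheme}://{host}:{port}"
    # Each option is appended as ;key=value, in the order given.
    for key, value in (options or {}).items():
        url += f";{key}={value}"
    return url


if __name__ == "__main__":
    print(build_thin_url("localhost"))
    print(build_thin_url("pqs.example.com", scheme="https",
                         options={"serialization": "PROTOBUF"}))
```

The resulting string can be passed to any JDBC tool that has the thin client jar on its classpath.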

The script bin/sqlline-thin.py is intended to behave identically to its sibling script bin/sqlline.py. It supports the following usage options.

bin/sqlline-thin.py [[scheme://]host[:port]] [sql_file]

The first optional argument is a connection URL, as described previously. When not provided, scheme defaults to http, host to localhost, and port to 8765.

bin/sqlline-thin.py http://localhost:8765

The second optional parameter is a SQL file from which to read commands.
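The defaulting rules for the first argument can be sketched as follows. This is not the actual logic of bin/sqlline-thin.py, only an illustration of how `[scheme://]host[:port]` falls back to http, localhost, and 8765. The function name `parse_target` is hypothetical, and IPv6 literals are not handled.

```python
def parse_target(arg=""):
    """Split '[scheme://]host[:port]' and apply the documented defaults.

    Hypothetical helper for illustration; IPv6 addresses are not supported.
    """
    scheme, rest = "http", arg
    if "://" in arg:
        scheme, rest = arg.split("://", 1)
    # Tolerate a trailing slash such as 'http://localhost:8765/'.
    host, _, port = rest.rstrip("/").partition(":")
    return scheme, host or "localhost", int(port) if port else 8765


if __name__ == "__main__":
    for target in ("", "myhost", "myhost:9000", "https://myhost"):
        print(repr(target), "->", parse_target(target))
```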

Wire API documentation

The API itself is documented in the Apache Calcite project as it is the Avatica API -- there is no wire API defined in Phoenix itself.

JSON API

Protocol Buffer API

For more information on building clients in other languages that work with Avatica, please feel free to reach out to the Apache Calcite dev mailing list.

Impersonation

By default, the Phoenix Query Server executes queries on behalf of the end-user. HBase permissions are enforced given the end-user, not the Phoenix Query Server's identity. In some cases, it may be desirable to execute the query as some other user -- this is referred to as "impersonation". This can enable workflows where a trusted user has the privilege to run queries for other users.

Impersonation can be enabled by setting the configuration property phoenix.queryserver.withRemoteUserExtractor to true. The URL of the Query Server can then be modified to include the required request parameter. For example, to let "bob" run a query as "alice", the following JDBC URL could be used:

jdbc:phoenix:thin:url=http://localhost:8765?doAs=alice

The standard Hadoop "proxyuser" configuration keys are checked to validate whether the "real" remote user is allowed to impersonate the "doAs" user. See the Hadoop documentation for more information on how to configure these rules.
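A minimal sketch of how a client might attach the impersonation parameter to the Query Server URL is shown below. The helper name `with_do_as` is hypothetical. The default parameter name `doAs` matches the default of phoenix.queryserver.remoteUserExtractor.param, and the sketch assumes the server has been configured to accept it.

```python
from urllib.parse import urlencode


def with_do_as(server_url, user, param="doAs"):
    """Return a thin JDBC URL that asks the server to run queries as `user`.

    Hypothetical helper; `param` must match the server-side
    phoenix.queryserver.remoteUserExtractor.param setting.
    """
    # Use '&' if the URL already carries a query string, '?' otherwise.
    separator = "&" if "?" in server_url else "?"
    query = urlencode({param: user})  # percent-encodes unsafe characters
    return f"jdbc:phoenix:thin:url={server_url}{separator}{query}"


if __name__ == "__main__":
    print(with_do_as("http://localhost:8765", "alice"))
```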

As a word of warning: there is no end-to-end test coverage for the HBase 0.98 and 1.1 Phoenix releases because of missing test-related code in those HBase releases. While we expect no issues on these Phoenix release lines, we recommend additional testing by the user to verify that there are no issues.

Metrics

By default, the Phoenix Query Server exposes various Phoenix global client metrics via JMX (for HBase versions 1.3 and up). The list of metrics is available here.

PQS metrics use Hadoop Metrics 2 internally for metrics publishing, so PQS also publishes various JVM-related metrics. Metrics can be filtered based on certain tags, which can be configured by the property specified in hbase-site.xml on the classpath. Further details are provided in the Configuration section.

Configuration

Server components are spread across a number of Java packages, so effective logging configuration requires updating multiple packages. The default server logging configuration sets the following log levels:

log4j.logger.org.apache.calcite.avatica=INFO
log4j.logger.org.apache.phoenix.queryserver.server=INFO
log4j.logger.org.eclipse.jetty.server=INFO

As of the time of writing, the underlying Avatica component respects the following configuration options exposed via hbase-site.xml.

Server Instantiation

Property	Description	Default
`phoenix.queryserver.http.port`	Port the server listens on.	`8765`
`phoenix.queryserver.metafactory.class`	Avatica `Meta.Factory` implementation class.	`org.apache.phoenix.queryserver.server.PhoenixMetaFactoryImpl`
`phoenix.queryserver.serialization`	Transport/serialization format (`PROTOBUF` or `JSON`).	`PROTOBUF`
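Properties such as these are set as `<property>` entries in hbase-site.xml. The sketch below renders a dictionary of settings into that layout. It assumes the standard Hadoop configuration file structure (`<configuration>` containing `<property>` elements with `<name>` and `<value>` children); the function name `hbase_site_fragment` is hypothetical.

```python
import xml.etree.ElementTree as ET


def hbase_site_fragment(props):
    """Render {name: value} settings as a Hadoop-style <configuration> block."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    # Pretty-print where supported (ET.indent was added in Python 3.9).
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(hbase_site_fragment({
        "phoenix.queryserver.http.port": 8765,
        "phoenix.queryserver.serialization": "PROTOBUF",
    }))
```

In practice the generated entries would be merged into the existing hbase-site.xml rather than replacing it.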

HTTPS

HTTPS support is only available in unbundled phoenix-queryserver versions.

Property	Description	Default
`phoenix.queryserver.tls.enabled`	Enables HTTPS transport. When enabled, keystore/truststore files and passwords are also required.	`false`
`phoenix.queryserver.tls.keystore`	Keystore file containing the HTTPS private key.	unset
`phoenix.queryserver.tls.keystore.password`	Password for HTTPS keystore.	empty string
`phoenix.queryserver.tls.truststore`	Keystore file containing the HTTPS certificate.	unset
`phoenix.queryserver.tls.truststore.password`	Password for HTTPS truststore.	empty string

Secure Cluster Connection

Property	Description	Default
`hbase.security.authentication`	When set to `kerberos`, server logs in before initiating Phoenix connections.	specified in `hbase-default.xml`
`phoenix.queryserver.keytab.file`	Path to the keytab file used for the Kerberos login.	unset
`phoenix.queryserver.kerberos.principal`	Kerberos principal for authentication; also used for SPNEGO if HTTP principal is not configured.	unset
`phoenix.queryserver.http.keytab.file`	Keytab for SPNEGO auth; required if `phoenix.queryserver.kerberos.http.principal` is set; falls back to `phoenix.queryserver.keytab.file`.	unset
`phoenix.queryserver.http.kerberos.principal`	Kerberos principal for SPNEGO auth; falls back to `phoenix.queryserver.kerberos.principal`.	unset
`phoenix.queryserver.kerberos.http.principal`	Deprecated; use `phoenix.queryserver.http.kerberos.principal`.	unset
`phoenix.queryserver.kerberos.allowed.realms`	Additional Kerberos realms allowed for SPNEGO auth.	unset
`phoenix.queryserver.dns.nameserver`	DNS hostname.	`default`
`phoenix.queryserver.dns.interface`	Network interface name for DNS queries.	`default`

Server Connection Cache

Property	Description	Default
`avatica.connectioncache.concurrency`	Connection cache concurrency level.	`10`
`avatica.connectioncache.initialcapacity`	Connection cache initial capacity.	`100`
`avatica.connectioncache.maxcapacity`	Connection cache maximum capacity; LRU eviction begins near this point.	`1000`
`avatica.connectioncache.expiryduration`	Connection cache expiration duration.	`10`
`avatica.connectioncache.expiryunit`	Time unit for `avatica.connectioncache.expiryduration`.	`MINUTES`

Server Statement Cache

Property	Description	Default
`avatica.statementcache.concurrency`	Statement cache concurrency level.	`100`
`avatica.statementcache.initialcapacity`	Statement cache initial capacity.	`1000`
`avatica.statementcache.maxcapacity`	Statement cache maximum capacity; LRU eviction begins near this point.	`10000`
`avatica.statementcache.expiryduration`	Statement cache expiration duration.	`5`
`avatica.statementcache.expiryunit`	Time unit for `avatica.statementcache.expiryduration`.	`MINUTES`

Impersonation

Property	Description	Default
`phoenix.queryserver.withRemoteUserExtractor`	If true, extracts impersonated user from request param instead of authenticated HTTP user.	`false`
`phoenix.queryserver.remoteUserExtractor.param`	HTTP request parameter name for impersonated user.	`doAs`

Metrics

Property	Description	Default
`phoenix.client.metrics.tag`	Tag for filtering Phoenix global client metrics emitted by PQS in `hadoop-metrics2.properties`.	`FAT_CLIENT`

Query Server Additions

The Phoenix Query Server is meant to be horizontally scalable, which makes it a natural fit for add-on features like service discovery and load balancing.

Load balancing

The Query Server can use off-the-shelf HTTP load balancers such as the Apache HTTP Server, nginx, or HAProxy. The primary requirement is that the load balancer must support "sticky sessions": once a client communicates with a backend server, that client continues to talk to the same backend server. The Query Server also provides some bundled functionality for load balancing using ZooKeeper.

The ZooKeeper-based load balancer works by automatically registering PQS instances in ZooKeeper and then allowing clients to query the list of available servers. This implementation, unlike the others mentioned above, requires that clients use the advertised information to make a routing decision. In this regard, the ZooKeeper-based approach is more akin to a service-discovery layer than a traditional load balancer. This load balancer implementation does not support SASL-based (Kerberos) ACLs in ZooKeeper (see PHOENIX-4085).

The following properties configure this load balancer:

Property	Description	Default
`phoenix.queryserver.loadbalancer.enabled`	If true, PQS registers itself in ZooKeeper for load balancing.	`false`
`phoenix.queryserver.base.path`	Root znode where PQS instances register themselves.	`/phoenix`
`phoenix.queryserver.service.name`	Unique name to identify this PQS instance.	`queryserver`
`phoenix.queryserver.zookeeper.acl.username`	Username for optional DIGEST ZooKeeper ACL.	`phoenix`
`phoenix.queryserver.zookeeper.acl.password`	Password for optional DIGEST ZooKeeper ACL.	`phoenix`
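Because the ZooKeeper-based approach leaves the routing decision to the client, a client needs some rule for choosing a server from the advertised list while keeping the "sticky" behaviour described above. The following sketch shows one possible rule, a deterministic hash of a client identifier. It is not the bundled implementation: the function name `pick_server`, the client identifier, and the server addresses are all hypothetical, and reading the server list from ZooKeeper is left out.

```python
import hashlib


def pick_server(client_id, servers):
    """Map a client to one advertised server, stable for a fixed server set.

    Hypothetical routing rule for illustration only.
    """
    if not servers:
        raise ValueError("no query servers advertised")
    ordered = sorted(servers)  # make the choice independent of list order
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(ordered)
    return ordered[index]


if __name__ == "__main__":
    advertised = ["pqs1.example.com:8765", "pqs2.example.com:8765"]
    print(pick_server("client-42", advertised))
```

With this rule a client keeps the same server only while the set of advertised servers is unchanged; when servers join or leave, some clients are remapped and must open new connections.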
