PhoenixSyncTable Tool

Detect data divergence between a source and a target Phoenix table across two HBase clusters via a chunked hash comparison driven by MapReduce.

PhoenixSyncTableTool is a MapReduce-based divergence detector for Phoenix tables that are replicated (or migrated) between two HBase clusters. For each region-aligned chunk it computes an SHA-256 hash on both clusters server-side and compares only the hashes — full rows never leave their cluster. Chunks whose hashes disagree are checkpointed to a Phoenix output table (PHOENIX_SYNC_TABLE_CHECKPOINT) for later inspection. Available in Phoenix 5.3.1 (PHOENIX-7751).

The tool is conceptually similar to HBase's HashTable/SyncTable pair but is Phoenix-aware (honors tenant id, indexes, the column-encoding scheme, and a bounded time range via --from-time/--to-time) and runs as a single MapReduce job, writing results directly to a Phoenix table instead of staging hashes in HDFS between two jobs. The output table is queryable with SQL.

Two operational properties differentiate it further from HashTable/SyncTable:

  • Resumable via checkpointing. Both mapper-region completion and per-chunk progress are persisted to the checkpoint table during the run. When a job is re-run (for example after a failure) with the same (table, target cluster, from-time, to-time, tenant) tuple, completed mapper regions are filtered out of the input splits and finished chunks are skipped, so verified work is not redone.
  • Optional split coalescing (--coalesce-split). When enabled, adjacent region splits co-located on the same RegionServer are grouped into a single mapper, reducing mapper count (and target-cluster RPC fan-out). Off by default; enable it for tables with many small regions, where per-mapper overhead dominates.

PhoenixSyncTableTool performs detection only in 5.3.1; it does not modify the target cluster.
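
The chunk-level comparison described above can be pictured with a small local analogy. The sketch below is not how the tool is implemented: two text files stand in for the source and target tables, a fixed number of lines stands in for a chunk, and sha256sum (GNU coreutils, assumed to be installed) stands in for the server-side hash. All file, function, and variable names are made up for illustration.

```shell
# Toy analogy only: local files play the role of the two clusters.
workdir=$(mktemp -d)

# "Source table": 12 rows. "Target table": same rows, except row 8 has drifted.
for i in $(seq 1 12); do printf 'row%02d,v%02d\n' "$i" "$i"; done > "$workdir/source.txt"
sed 's/^row08,v08$/row08,vXX/' "$workdir/source.txt" > "$workdir/target.txt"

# Hash one chunk (an inclusive line range) of a file; only this digest is compared.
chunk_hash() {
  local file=$1 first=$2 last=$3
  sed -n "${first},${last}p" "$file" | sha256sum | cut -d' ' -f1
}

chunk_lines=4   # rows per chunk here; the real tool sizes chunks in bytes (--chunk-size)
mismatched=()
for first in $(seq 1 "$chunk_lines" 12); do
  last=$((first + chunk_lines - 1))
  src=$(chunk_hash "$workdir/source.txt" "$first" "$last")
  tgt=$(chunk_hash "$workdir/target.txt" "$first" "$last")
  if [ "$src" != "$tgt" ]; then
    mismatched+=("rows ${first}-${last}")
  fi
done

printf 'mismatched chunk: %s\n' "${mismatched[@]}"
rm -rf "$workdir"
```

Only the chunk that contains the drifted row is reported, so any follow-up inspection can be limited to that range. This is the same granularity trade-off that --chunk-size controls in the real tool.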

When to use it

Reach for PhoenixSyncTableTool to verify:

  • A cluster migration that used HBase snapshots, replication, or both — to confirm the target is byte-for-byte identical after cutover.
  • Long-running HBase replication — to detect cases where a replication peer has silently drifted.
  • DR drills — to confirm the standby is in sync before a planned failover.

For ad-hoc row-count or row-key spot-checks you usually want a small SQL query instead; PhoenixSyncTableTool is the right choice when you need full-data confidence with bounded network cost.

Running the tool

The tool runs through hbase (or hadoop jar) and takes only two mandatory flags — the source table name and the target cluster's ZooKeeper quorum.

hbase org.apache.phoenix.mapreduce.PhoenixSyncTableTool \
  --table-name MY_SCHEMA.MY_TABLE \
  --target-cluster zk1,zk2,zk3:2181:/hbase \
  --run-foreground

The source cluster comes from the Hadoop/HBase configuration the job is submitted under, so --target-cluster is the ZooKeeper quorum of the other cluster. Accepted quorum formats:

  • host:port:/znode
  • h1,h2:port:/znode
  • h1:p1,h2:p2:/znode
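
As a pre-flight sanity check, the three shapes listed above can be approximated with two regular expressions. This is a hypothetical helper, not part of Phoenix; the tool's own parser remains the authority and may accept or reject strings that this sketch does not.

```shell
# Hypothetical helper: returns 0 if the string looks like one of the listed formats.
is_accepted_quorum() {
  local q=$1
  local host='[A-Za-z0-9._-]+'
  local port='[0-9]+'
  local znode='/[^:,]*'
  # host[,host...]:port:/znode   (one port shared by all hosts)
  local shared="^${host}(,${host})*:${port}:${znode}\$"
  # host:port[,host:port...]:/znode   (one port per host)
  local per_host="^${host}:${port}(,${host}:${port})*:${znode}\$"
  [[ $q =~ $shared || $q =~ $per_host ]]
}

for q in 'zk1:2181:/hbase' 'zk1,zk2,zk3:2181:/hbase' 'zk1:2181,zk2:2182:/hbase' 'zk1,zk2,zk3'; do
  if is_accepted_quorum "$q"; then
    echo "ok      $q"
  else
    echo "reject  $q"
  fi
done
```

The last sample is rejected because it carries neither a port nor a znode.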

Flags

| Short | Long | Required | Default | Purpose |
|---|---|---|---|---|
| -tn | --table-name | yes | | Source table (physical name; index physical names are also accepted). |
| -tc | --target-cluster | yes | | ZK quorum of the target cluster. |
| -s | --schema | no | | Phoenix schema name. |
| -tenant | --tenant-id | no | | Tenant id for tenant-specific sync. |
| -ft | --from-time | no | 0 | Lower bound of the cell-timestamp window, in ms. |
| -tt | --to-time | no | now - 1 hour | Upper bound; also used as CURRENT_SCN. The 1-hour buffer gives async replication time to catch up. |
| -cs | --chunk-size | no | 1073741824 (1 GiB) | Approximate chunk size in bytes. Smaller chunks narrow the divergence search radius at the cost of more checkpoint rows. |
| -rs | --raw-scan | no | false | Include delete markers. |
| -rav | --read-all-versions | no | false | Compare every cell version, not just the latest. |
| -coal | --coalesce-split | no | false | Coalesce multiple source regions into one mapper. |
| -runfg | --run-foreground | no | false | Block until the job completes (default is fire-and-forget submit). |
| -dr | --dry-run | no | false | Marker only; reserved for a future auto-repair extension. |
| -h | --help | no | | Print help and exit. |

The mapper count is implicitly the number of source-table regions (one mapper per region) unless --coalesce-split is set.
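
The two time flags take millisecond values. A minimal sketch of bounding a run to one calendar month, assuming the values are epoch milliseconds (the usual unit for HBase cell timestamps) and that GNU date is available; the dates are arbitrary, and the command is printed rather than executed.

```shell
# Convert UTC wall-clock bounds to epoch milliseconds (GNU date syntax).
from_ms=$(( $(date -u -d '2024-01-01 00:00:00' +%s) * 1000 ))
to_ms=$(( $(date -u -d '2024-02-01 00:00:00' +%s) * 1000 ))

# Build the invocation as an array so each argument stays intact.
cmd=(hbase org.apache.phoenix.mapreduce.PhoenixSyncTableTool
  --table-name MY_SCHEMA.MY_TABLE
  --target-cluster zk1,zk2,zk3:2181:/hbase
  --from-time "$from_ms"
  --to-time "$to_ms"
  --run-foreground)

printf '%s ' "${cmd[@]}"; echo
# To submit the job, run: "${cmd[@]}"
```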

Output

MapReduce counters

When --run-foreground is set, the tool logs counters from the PhoenixSyncTableMapper$SyncCounters group:

  • MAPPERS_VERIFIED, MAPPERS_MISMATCHED
  • CHUNKS_VERIFIED, CHUNKS_MISMATCHED
  • SOURCE_ROWS_PROCESSED, TARGET_ROWS_PROCESSED

PHOENIX_SYNC_TABLE_CHECKPOINT

The tool auto-creates a Phoenix table on the source cluster (90-day TTL, Snappy compression) with one row per chunk and one row per region. To list the mismatched chunks recorded for a given table and target cluster:

SELECT START_ROW_KEY, END_ROW_KEY, COUNTERS, EXECUTION_END_TIME
FROM   PHOENIX_SYNC_TABLE_CHECKPOINT
WHERE  TABLE_NAME = 'MY_TABLE'
  AND  TARGET_CLUSTER = 'zk1,zk2,zk3:2181:/hbase'
  AND  TYPE = 'CHUNK'
  AND  STATUS = 'MISMATCHED';

Each row carries STATUS (VERIFIED or MISMATCHED), TYPE (CHUNK or REGION), the key range, and a comma-separated COUNTERS string with per-chunk source and target row counts.
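
For a quick overview before drilling into individual chunks, the same columns can be aggregated. The sketch below writes a summary query to a file and prints a sqlline.py command for it. It assumes sqlline.py (shipped with Phoenix) is on the PATH; SOURCE_ZK is a placeholder for the source cluster's quorum, where the checkpoint table lives, and the file path and the CHECKPOINT_ROWS alias are illustrative.

```shell
# Placeholder quorum for the source cluster (assumption, not from the tool).
SOURCE_ZK='src-zk1,src-zk2,src-zk3:2181:/hbase'
sql_file="${TMPDIR:-/tmp}/phoenix_sync_summary.sql"

# Count checkpoint rows per TYPE (CHUNK or REGION) and STATUS (VERIFIED or MISMATCHED).
cat > "$sql_file" <<'SQL'
SELECT TYPE, STATUS, COUNT(*) AS CHECKPOINT_ROWS
FROM   PHOENIX_SYNC_TABLE_CHECKPOINT
WHERE  TABLE_NAME = 'MY_TABLE'
  AND  TARGET_CLUSTER = 'zk1,zk2,zk3:2181:/hbase'
GROUP  BY TYPE, STATUS;
SQL

cmd=(sqlline.py "$SOURCE_ZK" "$sql_file")
printf '%s ' "${cmd[@]}"; echo
# To execute the query, run: "${cmd[@]}"
```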

Resumability

A re-run of the same (table, target, from-time, to-time, tenant) tuple picks up where the previous run left off — already-verified sub-ranges are skipped.
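
Because --to-time defaults to now - 1 hour, two invocations that both rely on the default would presumably carry different upper bounds and so would not form the same tuple. A minimal sketch of one way around that, assuming the tool matches runs on the flag values it is given: record the window once and reuse it on every retry. The file path and variable names are illustrative, the timestamps are assumed to be epoch milliseconds, and the command is printed rather than executed.

```shell
# Hypothetical location for the pinned window; any durable path works.
window_file="${TMPDIR:-/tmp}/phoenix_sync_window.env"

# First attempt: pin the window. Retries: reuse the recorded values unchanged.
if [ ! -f "$window_file" ]; then
  now_ms=$(( $(date +%s) * 1000 ))
  {
    echo "FROM_MS=0"
    echo "TO_MS=$(( now_ms - 60 * 60 * 1000 ))"   # mirrors the documented default
  } > "$window_file"
fi
. "$window_file"

cmd=(hbase org.apache.phoenix.mapreduce.PhoenixSyncTableTool
  --table-name MY_SCHEMA.MY_TABLE
  --target-cluster zk1,zk2,zk3:2181:/hbase
  --from-time "$FROM_MS"
  --to-time "$TO_MS"
  --run-foreground)

printf '%s ' "${cmd[@]}"; echo
# To submit the job, run: "${cmd[@]}"
```

Deleting the window file starts a fresh verification window on the next invocation.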

Prerequisites

  • Cross-cluster line of sight. The YARN nodes that run the mappers must be able to reach ZooKeeper and the RegionServers (RPC) on both clusters.
  • Both clusters must run Phoenix 5.3.1+.
  • Live read, not snapshot-based. Both clusters are scanned through the regular Phoenix read path.
  • Kerberos delegation tokens for the target cluster are acquired automatically when security is enabled.
  • The submitter principal needs READ on the physical HBase tables on both clusters, plus WRITE to PHOENIX_SYNC_TABLE_CHECKPOINT on the source.
  • Views and logical (not physical) index names are rejected. Pass the physical index table name to validate an index.

Tuning

--chunk-size is the main lever:

  • Larger chunks (e.g. 4 GiB) reduce checkpoint rows and per-chunk overhead but make every mismatch report a coarser range.
  • Smaller chunks (e.g. 64 MiB) narrow the mismatch search radius but produce more checkpoint rows.
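
Since --chunk-size is expressed in bytes, it can help to compute the value rather than type it out. A small sketch using the two example sizes above; the variable names are illustrative and the command is printed rather than executed.

```shell
mib=$((1024 * 1024))
gib=$((1024 * mib))

small_chunk=$((64 * mib))   # 64 MiB in bytes
large_chunk=$((4 * gib))    # 4 GiB in bytes
echo "64 MiB = $small_chunk bytes, 4 GiB = $large_chunk bytes"

cmd=(hbase org.apache.phoenix.mapreduce.PhoenixSyncTableTool
  --table-name MY_SCHEMA.MY_TABLE
  --target-cluster zk1,zk2,zk3:2181:/hbase
  --chunk-size "$small_chunk"
  --run-foreground)

printf '%s ' "${cmd[@]}"; echo
# To submit the job, run: "${cmd[@]}"
```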

Scans issued by the tool can run for a long time. If you see scanner timeouts on very large regions, adjust these client-side timeouts (set in the Hadoop Configuration the job is submitted with):

| Property | Default |
|---|---|
| phoenix.sync.table.query.timeout | ~150 minutes |
| phoenix.sync.table.rpc.timeout | 30 minutes |
| phoenix.sync.table.client.scanner.timeout | 30 minutes |
| phoenix.sync.table.rpc.retries.counter | 5 |
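
One possible way to override these properties for a single run is Hadoop's -D generic option. The sketch below rests on two assumptions that this page does not confirm: that the tool is launched through Hadoop's ToolRunner, so -D key=value pairs placed before the tool's own flags reach its Configuration, and that the timeout values are expressed in milliseconds. Check both against your Phoenix version before relying on it; the minutes_ms helper is illustrative, and the command is printed rather than executed.

```shell
# Illustrative helper: minutes -> milliseconds (the unit is an assumption).
minutes_ms() { echo $(( $1 * 60 * 1000 )); }

cmd=(hbase org.apache.phoenix.mapreduce.PhoenixSyncTableTool
  -D "phoenix.sync.table.rpc.timeout=$(minutes_ms 60)"
  -D "phoenix.sync.table.client.scanner.timeout=$(minutes_ms 60)"
  --table-name MY_SCHEMA.MY_TABLE
  --target-cluster zk1,zk2,zk3:2181:/hbase
  --run-foreground)

printf '%s ' "${cmd[@]}"; echo
# To submit the job, run: "${cmd[@]}"
```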

Limitations

  • Detection only. Mismatched chunks are recorded but not repaired in 5.3.1 (see Upcoming).
  • No views. Only physical tables and index physical names are accepted.

Upcoming

A future iteration of the tool will add a repair phase that converts recorded MISMATCHED chunks into upserts on the target cluster, turning detection into verify-and-repair in a single workflow. The current --dry-run flag is reserved for that mode.
