Coprocessor-based Secondary Index on HBase

Anoop Sam John - Senior Software Engineer at Intel in the Big Data Platform Engineering group and an HBase Committer. He previously worked as a Platform Engineer at Huawei Technologies on Big Data and Cloud technologies.

Ramkrishna S Vasudevan - Senior Software Engineer at Intel in the Big Data Platform Engineering group, HBase Committer and PMC member. He previously worked as a Platform Engineer at Huawei Technologies on Big Data and Cloud technologies.

Secondary indexing is one of the most discussed and most requested features in HBase, and there have been multiple attempts to implement it.

HBase stores data in the lexicographical order of its rowkeys (RK), and the RK is the primary, and only, index in HBase. When the RK range is known, a scan is easy because only a limited slice of data has to be fetched. But when a scan has to select rows based on a column value, or a range of column values, it becomes inefficient: the full table has to be scanned and the rows filtered on the column values. (HBase provides SingleColumnValueFilter, which can be used in this case, but the filter is still evaluated against every row.) This makes choosing the RK model for an HBase table a very difficult task.
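
For example, selecting rows by a column value without an index looks like the snippet below: the client attaches a SingleColumnValueFilter and the server still walks the whole table to evaluate it. The table, family, column and value names are only placeholders for illustration.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
    import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    public class FilterScanExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, TableName.valueOf("user_table"));
        try {
          Scan scan = new Scan();
          // Without a secondary index this filter is evaluated against every row in the table.
          scan.setFilter(new SingleColumnValueFilter(Bytes.toBytes("cf"), Bytes.toBytes("city"),
              CompareOp.EQUAL, Bytes.toBytes("Bangalore")));
          ResultScanner scanner = table.getScanner(scan);
          for (Result result : scanner) {
            // process the matching rows
          }
          scanner.close();
        } finally {
          table.close();
        }
      }
    }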

Index on columns

In the database world, an index on a column makes retrieval of data based on that column's value fast and efficient. The basic idea is to map the column value to the actual row references in some way. When data has to be retrieved based on the indexed column value, the index is consulted first to find the actual table row references, and the actual data is then fetched from those rows. This avoids a full table scan.

HBase Secondary Index

Implementing a secondary index for a distributed system like HBase is not a straightforward task. How the index metadata is maintained and used is a troublesome question, and this metadata itself must be stored in a distributed fashion.

In HBase, a secondary index can be implemented with two approaches.

  • A client side implementation, where the client handles the index metadata separately. During a write, the index metadata is created and written along with the actual data. During a scan, the index metadata is first read back to the client and, using that information, the actual data is read from the table.
  • A server side implementation, where only the server handles the index metadata during writes and reads.

Each of these approaches has its own advantages and disadvantages.


Recently Huawei open sourced an implementation of HBase secondary indexing, HIndex [also see the issue HBASE-9203]. It takes the second approach, a server side implementation. The implementation uses HBase's coprocessor (CP) feature, so it is 100% pluggable and requires only minimal changes to HBase core. [The changes required in the core code have been added to core as new CP hooks.]

The index metadata is stored in a separate HBase table, the index table. The index table is created when the actual table is created, using the CP hooks on the HMaster side, and it is deleted when the actual table is deleted. The index data is written using the CP hooks on the region server side: these hooks extract the indexed column values from the data written to the actual table and create the corresponding entries in the index table. In the same way, when data is deleted from the actual table, the corresponding entries in the index table are deleted by the CP hook implementation in HIndex.

When the index table is created, its regions are also balanced across the cluster. It would be inefficient for the CP hook to read or write index data across servers: making an RPC call from a CP hook (which itself executes as part of an RPC) is an anti-pattern and is normally discouraged. If the index data can be read and written within the same region server, this extra RPC cost is avoided. In other words, it would be ideal to establish a relation between the regions of the main table and the index table and to collocate the related regions in the same RS. The HIndex implementation achieves this in the following way.

It maintains a per-region index: the index table is created with the same number of regions as the actual table and with the same RK ranges, and these RK ranges form the region relation. HIndex also uses a custom load balancer on top of the HBase load balancer, whose responsibility is to maintain the collocation of the related regions. Whenever an actual table region is split or moved, the same action is taken on the corresponding index region (again using CP hooks). There is only one index table per actual table, irrespective of the number of indices defined on it; all the index metadata is stored in this single index table.
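
The core of that idea can be sketched with the plain admin API: create the index table with exactly the same split points as the user table, so that for every user table region there is an index table region covering the same RK range. In HIndex this is done inside the master side CP hooks; the sketch below is only an illustration, and the table and column family names are placeholders.

    import java.util.List;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HRegionInfo;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class CreateIndexTableSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        try {
          // Region boundaries of the existing user table.
          List<HRegionInfo> userRegions = admin.getTableRegions(TableName.valueOf("user_table"));
          // The first region starts at the empty key; the remaining start keys are the split points.
          byte[][] splitKeys = null;
          if (userRegions.size() > 1) {
            splitKeys = new byte[userRegions.size() - 1][];
            for (int i = 1; i < userRegions.size(); i++) {
              splitKeys[i - 1] = userRegions.get(i).getStartKey();
            }
          }
          // Index table with the same number of regions and the same RK ranges.
          HTableDescriptor indexDesc = new HTableDescriptor(TableName.valueOf("user_table_idx"));
          indexDesc.addFamily(new HColumnDescriptor("d"));
          admin.createTable(indexDesc, splitKeys);
        } finally {
          admin.close();
        }
      }
    }

The custom load balancer then only has to make sure that the two regions sharing a start key are always hosted on the same region server.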

Writing data with index

The data for the index table is extracted from the Puts on the main table. The CP creates the index table RK as:

Index table RK = region startkey + index name + indexed column(s) value(s) + user table RK

The actual table RK is added as the last part of the index table RK. This makes the index table tall and narrow, which is better in many ways. The indexed column value(s) come right after the region start key and the index name, so that a scan with an indexed column value or value range fetches only the needed data from the index table. Since the entire index metadata related to a table is stored in a single index table, the index name is also made part of the RK. The RK starts with the region start key so that the entry falls into the corresponding (collocated) index table region. When an index is on more than one column, all of those column values are appended in the RK. For example, for an index named cityIdx, a row with RK row123 and city value Bangalore hosted in a region whose start key is row100 would get the index RK row100 + cityIdx + Bangalore + row123.

The Put object can be created as normal using the HBase client APIs. The CP implementation decides what index metadata has to be stored, creates it and stores it.
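
The sketch below shows roughly what such a region-server-side hook could look like. It is a simplified illustration of the idea and not HIndex's actual code: the index, table, family and column names are placeholders, only a single hard-coded index is handled, and the index entry is written through the environment's table handle (an extra RPC) rather than directly into the collocated index region as HIndex does.

    import java.io.IOException;
    import java.util.List;
    import org.apache.hadoop.hbase.Cell;
    import org.apache.hadoop.hbase.CellUtil;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Durability;
    import org.apache.hadoop.hbase.client.HTableInterface;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
    import org.apache.hadoop.hbase.coprocessor.ObserverContext;
    import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
    import org.apache.hadoop.hbase.regionserver.wal.WALEdit;
    import org.apache.hadoop.hbase.util.Bytes;

    public class SimpleIndexRegionObserver extends BaseRegionObserver {

      private static final byte[] CF = Bytes.toBytes("cf");              // indexed column family (placeholder)
      private static final byte[] COL = Bytes.toBytes("city");           // indexed column (placeholder)
      private static final byte[] INDEX_NAME = Bytes.toBytes("cityIdx"); // index name (placeholder)
      private static final TableName INDEX_TABLE = TableName.valueOf("user_table_idx");

      @Override
      public void prePut(ObserverContext<RegionCoprocessorEnvironment> ctx, Put put,
          WALEdit edit, Durability durability) throws IOException {
        List<Cell> cells = put.get(CF, COL);
        if (cells == null || cells.isEmpty()) {
          return; // nothing to index in this Put
        }
        byte[] indexedValue = CellUtil.cloneValue(cells.get(0));
        byte[] regionStartKey = ctx.getEnvironment().getRegion().getRegionInfo().getStartKey();

        // Index table RK = region startkey + index name + indexed column value + user table RK
        byte[] indexRk = Bytes.add(Bytes.add(regionStartKey, INDEX_NAME, indexedValue), put.getRow());

        Put indexPut = new Put(indexRk);
        indexPut.add(Bytes.toBytes("d"), Bytes.toBytes("q"), new byte[0]); // the payload is in the RK itself
        HTableInterface indexTable = ctx.getEnvironment().getTable(INDEX_TABLE);
        try {
          indexTable.put(indexPut);
        } finally {
          indexTable.close();
        }
      }
    }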

Reading data with index

A scan on the user table creates a scanner on the index table as well. Based on the condition on the indexed column, the required entries are fetched from the index table. The actual table RK is extracted from the index data, and the CP seeks to the exact rows in the actual table. This avoids reading each and every row from the actual table and filtering it against the condition. HIndex is intelligent enough to decide whether to use the index data for a particular scan and, when there are multiple indices on the table, which index(es) to use. The Scan object at the client side can be as normal as before; the CP hooks inspect the SingleColumnValueFilter(s) passed in the Scan to decide on the index(es) to use.
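
As a rough illustration of the read path for an equality condition: within one region, the matching index entries can be fetched with a prefix scan on the index table, the user table RK recovered as the tail of each index RK, and the actual row then read from the user table. HIndex does this inside its scanner hooks and also handles value ranges, multiple indices and region boundaries; the sketch below shows only the basic idea, with placeholder names.

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTableInterface;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.filter.PrefixFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    public class IndexReadSketch {

      /** Look up user rows where the indexed column equals 'value', for one region. */
      static void readViaIndex(HTableInterface indexTable, HTableInterface userTable,
          byte[] regionStartKey, byte[] indexName, byte[] value) throws IOException {
        // For an equality condition the index RK prefix is fully known:
        // region startkey + index name + indexed column value
        byte[] prefix = Bytes.add(regionStartKey, indexName, value);
        Scan indexScan = new Scan(prefix);
        indexScan.setFilter(new PrefixFilter(prefix));
        ResultScanner indexScanner = indexTable.getScanner(indexScan);
        try {
          for (Result indexRow : indexScanner) {
            byte[] indexRk = indexRow.getRow();
            // The user table RK was appended as the last part of the index RK.
            byte[] userRk = Bytes.tail(indexRk, indexRk.length - prefix.length);
            Result userRow = userTable.get(new Get(userRk));
            // process userRow
          }
        } finally {
          indexScanner.close();
        }
      }
    }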

HIndex supports the following features:

  • Multiple indices on a table
  • Multi-column index
  • Index on part of a column value
  • Usage of the index for user table scans with a column value/range of values on the indexed columns
  • Bulk loading data to an indexed table

HBase configuration
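
Since everything is implemented as coprocessors plus a custom load balancer, enabling the feature is essentially a matter of registering those classes in hbase-site.xml on the cluster. The snippet below only illustrates the shape of such a configuration using the standard HBase properties; the class names are placeholders and not the actual HIndex class names (refer to the HIndex project for the exact values).

    <!-- hbase-site.xml: the class names below are placeholders, not the actual HIndex classes -->
    <property>
      <name>hbase.coprocessor.master.classes</name>
      <value>org.example.index.IndexMasterObserver</value>
    </property>
    <property>
      <name>hbase.coprocessor.region.classes</name>
      <value>org.example.index.IndexRegionObserver</value>
    </property>
    <property>
      <name>hbase.master.loadbalancer.class</name>
      <value>org.example.index.IndexLoadBalancer</value>
    </property>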

Future Work:


  • Dynamically add/drop index on tables
  • Shell support for table index management
  • Optimize range scan scenarios
  • Custom blooms to speed up reads from the index table
  • HBCK tool support for Secondary index tables
  • Make Scan Evaluation Intelligence Pluggable

[See for all the open issues]

Anoop & Ramkrishna