Introduction to Bitmap Indexes
To alleviate confusion, I will refer to bit-indexing and bit mapped indexing as a bitmap index. Through my research I have seen these terms used interchangeably to define the same concept – bitmap indexes.
What is a bitmap index and how do they work?
Bitmap indexes are a mechanism, largely employed by Oracle databases, to increase search performance on large data sets. Bitmap indexes are most effective when applied to columns that exhibit low cardinality. In this case, cardinality represents the amount of unique values that a column may contain. For example, a column called “Active” on a user account table would likely only contain two values: true or false (or active and disabled). Regardless of the total number of tuples, the values contained in this column is only two possible values. This column exhibits low cardinality. A column that exhibits high cardinality benefit less from creating an index on them and become candidates for a primary key – that is if each tuple contains a unique value.
When a bitmap index is created on a column, an index is created that contains columns that match the records in the table/column that is being indexed, with rows that represent the possible values. To help illustrate how bitmap indexes work, I will begin with a table of sample data that represents a “User” table, the table will have the following structure:
If we create a bitmap index on the “Active” column, that index would logically look like the following:
|Active column||Record 1
How does this structure benefit us? We can look at a simple query to help get a better understanding, take a simple SELECT with a WHERE clause:
SELECT * FROM Users WHERE Active = ‘Active’
Without an index, our database system has to compare each record’s Active column with the value in our where clause, in this case ‘Active’. This means that every record in the table has to be read and the values compared (we do a full table scan). When we create an index, the database system can now access the index and look for all columns that have a value of 1, using the corresponding record ID to then lookup only the information required from the underlying table. While the implementation can vary, the bitmap index often times is loaded into memory on the server, so scanning the index can be done much quicker than doing a full table scan relying on reads from a physical disk. This method also utilizes much less data, as reading an integer value in the index is going to be less than retrieving a larger record during a full table scan.
Advantages and Disadvantages of bitmap indexes
Advantages of a bitmap index are the most significant when the cardinality of the column they are being created from is low. When an index is created, it allows for bitwise operations when the database system is utilizing the index for search results. Without getting into the details of bitwise operations, they offer a significant increase in performance as the database system can do operations like bitwise AND and bitwise OR. Bitmap indexes are also a highly compressed data structure, which makes them efficient to read. This highly compressed attribute is also beneficial when storing the index on disk and subsequently loading them into memory for use.
While our discussion has focused on a simple query with a single index, bitmap indexes can dramatically increase query performance when combined with other bitmap indexes. When multiple columns have low cardinality, bitmap indexes can be created for each. Now when complex queries, that is queries with complex filters (where/on clause, et cetera) are executed against these columns, the query can benefit from the performance increase from the bitmap indices instead of relying on multiple full table scans.
While the advantages of the bitmap index seem to be numerous, they do have their downside. Bitmap indexes are typically employed in a data warehouse environment where data is frequently read and there is minimal writes. The biggest drawback to a bitmap index is when it comes to writing, especially in a concurrent environment. While an index is being updated, it will lock the table while the update is in progress. Once the update is complete, the index will be released. This can cause significant issues, especially if running in a concurrent environment, as other queries that utilize the index have to wait for it to be released. There is also more overhead in simply maintaining the index and they are best suited in environments where INSERT/UPDATE/DELETE operations are done in bulk.
As previously mentioned, bitmap indexes are most efficient on columns with low cardinality. Applying a bitmap index on a column with high cardinality may result in queries actually taking longer to execute. I say may as I have encountered some opinions that argue that bitmap indexes may perform well on high cardinality data. However, Oracle’s recommendation is to use them on low cardinality columns (Oracle).
Bitmap indexes can significantly reduce query time when implemented on appropriate columns. The decision to use them is largely dependent on the columns exhibiting the characteristics discussed in this document. Additionally, database systems often provide performance monitoring tools which can be leveraged to measure query time and therefore gauge the effectiveness of creating a bitmap index (or conversely removing one). With the overhead of maintaining these indexes, they are best suited for a data warehouse environment, one in which there is a high order of reads and minimal writes. Bitmap indexes are a great resource when leveraged properly.
“Oracle Bitmap Index Techniques.” Oracle Bitmap Index. Burleson Consulting, n.d. Web. 19 Sept. 2014.
Oracle. “Bookshelf V7.5: Bitmap Indexes.” Bookshelf V7.5: Bitmap Indexes. Oracle, 18 Apr. 2003. Web. 19 Sept. 2014.
“Part 4 – The Bitmap Legacy.” Welcome to The Oracle FAQ. Oracle FAQ’s, 29 May 201. Web. 19 Sept. 2014.
Sharma, Vivek. “Bitmap Index vs. B-tree Index: Which and When?” Bitmap Index vs. B-tree Index: Which and When? Oracle, 2005. Web. 19 Sept. 2014.