Skip to content

Hash based memtable implementations

haoboxu edited this page Jan 17, 2014 · 1 revision

Other than the default memtable implementation using skip lists, users can use other types of memtable implementation, for example HashLinkList and HashSkipList, to speed-up some queries.

As their names imply, HashSkipList organizes data in a hash table with each hash bucket to be a skip list, while HashLinkList organizes data in a hash table with each hash bucket as a sorted single linked list. Both types are built to reduce number of comparisons when doing queries. One good use case is to combine them with PlainTable SST format and store data in RAMFS.

When doing a look-up or inserting a key, target key's prefix is retrieved using Options.prefix_extractor, which is used to find the hash bucket. Inside a hash bucket, all the comparisons are done using whole (internal) keys, just as SkipList based memtable.

Comparison:

Mem Table Type SkipList HashSkipList HashLinkList
Optimized Use Case General Range query within a specific key prefix Range query within a specific key prefix and there are only a small number of rows for each prefix
Index type binary search hash + binary search hash + linear search
Support totally ordered full db scan? naturally very costly (copy and sort to create a temporary totally-ordered view) very costly (copy and sort to create a temporary totally-ordered view)
Memory Overhead Average (multiple pointers per entry) High (Hash Buckets + Skip List Metadata for non-empty buckets + multiple pointers per entry) Lower (Hash buckets + pointer per entry)
MemTable Flush Fast with constant extra memory Slow with high temporary memory usage Slow with high temporary memory usage

The biggest limitation of the hash based memtables is that doing scan across multiple prefixes requires copy and sort, which is very slow and memory costly.

Clone this wiki locally