banner
Tech Whims

Tech Whims

Allen Tech , 大数据技术 个人主站:https://techwhims.com/

Log Structured Merge Tree

Recorded on April 6, 2023, originated from the study of HBase principles.

Core Idea of LSM#

Core Idea of LSM

LSM tree consists of the following three important components:

  1. MemTable

    MemTable is a data structure in memory used to store recently updated data. It organizes this data in an ordered manner based on keys. The LSM tree does not have a specific data structure definition for organizing data in order. For example, HBase uses skip lists to ensure the order of keys in memory.

    Because the data is temporarily stored in memory, which is not reliable storage, data reliability is usually ensured through Write-ahead Logging (WAL).

  2. Immutable MemTable

    When the MemTable reaches a certain size, it is converted into an Immutable MemTable. Immutable MemTable is an intermediate state where MemTable is transformed into SSTable. Write operations are handled by the new MemTable, and data update operations are not blocked during the conversion process.

  3. SSTable (Sorted String Table)

    SSTable is a data structure in the LSM tree stored on disk, consisting of ordered key-value pairs. To speed up the reading of SSTable, an index of keys and a Bloom filter can be established to accelerate key lookup.
    Sorted String Table

Compact Strategy of LSM Tree#

Two basic strategies are mainly introduced: size-tiered and leveled.

Important concepts:

  • Read amplification: The amount of data actually read when reading data is greater than the actual amount of data. For example, in the LSM tree, it is necessary to check whether the current key exists in the MemTable first, and if not, continue to search from the SSTable.
  • Write amplification: The amount of data actually written when writing data is greater than the actual amount of data. For example, writing in the LSM tree may trigger a Compact operation, resulting in a much larger amount of data written than the data size of that key.
  • Space amplification: The actual disk space occupied by data is more than the actual size of the data. The redundant storage mentioned above means that for a key, only the latest record is valid, and previous records can be cleaned and recycled.

Size-tiered Strategy#

The size-tiered strategy ensures that the size of each SSTable in each level is similar, while limiting the number of SSTables in each level. As shown in the above figure, each level limits the SSTable to N. When the number of SSTables in each level reaches N, a Compact operation is triggered to merge these SSTables, and the merged result is written to the next level as a larger SSTable.
Size-tiered

It can be seen that when the number of levels reaches a certain quantity, the size of the individual SSTable in the bottom level becomes very large. And the size-tiered strategy will cause significant space amplification. Even for SSTables in the same level, there may be multiple copies of each key's record, and these redundant records are only eliminated when the SSTable in that level performs a compact operation.

Leveled Strategy#

The leveled strategy also adopts a layered approach, with each level limiting the total size of files.
Layered Approach

However, unlike the size-tiered strategy, the leveled strategy divides each level into multiple SSTables of similar size. These SSTables are globally ordered within this level, which means that a key has at most one record in each level, and there are no redundant records.
Globally Ordered SSTable

Compared to the size-tiered strategy, the leveled strategy ensures that there are no duplicate keys within each level. Even in the worst case, except for the bottom level, all other levels have duplicate keys, and the redundancy ratio is very small according to the ratio of adjacent level sizes, which is 10. Therefore, the problem of space amplification is alleviated. However, the problem of write amplification becomes more prominent.

Loading...
Ownership of this post data is guaranteed by blockchain and smart contracts to the creator alone.