LSM-tree-based key-value (KV) stores organize data in a multi-level structure for high-speed writes. Range queries on traditional LSM-trees must seek and sort-merge data from multiple table files on the fly, which is expensive and often leads to mediocre read performance. To improve range query efficiency on LSM-trees, we introduce a space-efficient KV index data structure, named REMIX, that records a globally sorted view of KV data spanning multiple table files. A range query on multiple REMIX-indexed data files can quickly locate the target key using a binary search, and retrieve subsequent keys in sorted order without key comparisons. We build RemixDB, an LSM-tree-based KV store that adopts a write-efficient compaction strategy and employs REMIXes for fast point and range queries. Experimental results show that REMIXes can substantially improve range query performance in a write-optimized LSM-tree-based KV store.
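To make the idea of a globally sorted view concrete, the following is a minimal Python sketch and not the actual REMIX layout. It assumes each table file is an in-memory sorted list of (key, value) pairs with keys unique across files, and it stores one (run, position) pair per key, so it is deliberately not space-efficient. The names `build_sorted_view`, `seek`, and `scan` are hypothetical and chosen only for illustration.

```python
import heapq
from typing import Iterator, List, Tuple

# Assumption: one table file is modeled as a sorted list of (key, value) pairs.
Run = List[Tuple[str, str]]
View = List[Tuple[int, int]]  # (run_id, position) for every key, in global key order


def build_sorted_view(runs: List[Run]) -> View:
    """Sort-merge the runs once and record where each key lives."""
    tagged = (
        [(key, run_id, pos) for pos, (key, _) in enumerate(run)]
        for run_id, run in enumerate(runs)
    )
    return [(run_id, pos) for _, run_id, pos in heapq.merge(*tagged)]


def seek(runs: List[Run], view: View, target: str) -> int:
    """Binary search on the view for the first key >= target."""
    lo, hi = 0, len(view)
    while lo < hi:
        mid = (lo + hi) // 2
        run_id, pos = view[mid]
        if runs[run_id][pos][0] < target:
            lo = mid + 1
        else:
            hi = mid
    return lo


def scan(runs: List[Run], view: View, start: str, count: int) -> Iterator[Tuple[str, str]]:
    """Yield up to `count` entries from `start` onward, with no key comparisons after the seek."""
    i = seek(runs, view, start)
    for run_id, pos in view[i:i + count]:
        yield runs[run_id][pos]


runs = [
    [("b", "2"), ("e", "5")],
    [("a", "1"), ("d", "4"), ("f", "6")],
    [("c", "3")],
]
view = build_sorted_view(runs)
print(list(scan(runs, view, "c", 3)))  # [('c', '3'), ('d', '4'), ('e', '5')]
```

The merge cost is paid once when the view is built. Each query then needs only the comparisons made by the binary search, and the scan simply follows the recorded positions instead of merging the runs again.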