We introduce bloomRF, a unified method for approximate membership testing that supports both point and range queries. As a first core idea, bloomRF introduces novel prefix hashing to efficiently encode range information in the hash code of the key itself. As a second key concept, bloomRF proposes novel piecewise-monotone hash functions that preserve local order and support fast range lookups with fewer memory accesses. bloomRF has near-optimal space complexity and constant query complexity. Although bloomRF is designed for integer domains, it also supports floating-point numbers and can serve as a multi-attribute filter. The evaluation in RocksDB and in a standalone library shows that bloomRF outperforms existing point-range filters by up to 4x across a range of settings and distributions, while keeping the false-positive rate low.
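To make the general idea of answering range queries from hashed key prefixes concrete, here is a minimal Python sketch. It is not the bloomRF algorithm: it stores every binary prefix of each key in a plain Bloom filter using independent hashes, has no piecewise-monotone hashing, and its range lookup cost grows with the logarithm of the range size rather than staying constant. The class name `PrefixBloomFilter`, its parameters, and the choice of BLAKE2b are assumptions made only for illustration.

```python
import hashlib


class PrefixBloomFilter:
    """Toy range filter (hypothetical, not bloomRF): one Bloom filter
    holding every binary prefix of each non-negative integer key."""

    def __init__(self, num_bits=1 << 16, num_hashes=3, key_bits=32):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.key_bits = key_bits
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, level, prefix):
        # Derive num_hashes bit positions from the (level, prefix) pair.
        for seed in range(self.num_hashes):
            data = f"{seed}:{level}:{prefix}".encode()
            digest = hashlib.blake2b(data, digest_size=8).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def _set(self, level, prefix):
        for pos in self._positions(level, prefix):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def _test(self, level, prefix):
        return all(self.bits[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(level, prefix))

    def add(self, key):
        # level = number of low-order bits dropped; level 0 is the key itself.
        for level in range(self.key_bits + 1):
            self._set(level, key >> level)

    def may_contain(self, key):
        return self._test(0, key)

    def may_contain_range(self, lo, hi):
        """Return False only if no inserted key lies in [lo, hi]."""
        # Cover [lo, hi] with aligned power-of-two blocks (dyadic intervals).
        while lo <= hi:
            level = 0
            # Grow the block while it stays aligned at lo and inside [lo, hi].
            while (level < self.key_bits
                   and lo % (1 << (level + 1)) == 0
                   and lo + (1 << (level + 1)) - 1 <= hi):
                level += 1
            if self._test(level, lo >> level):
                return True
            lo += 1 << level
        return False


f = PrefixBloomFilter()
for k in (42, 1000, 65536):
    f.add(k)
print(f.may_contain(42), f.may_contain_range(40, 50))
```

Because every key sets the bits for all of its prefixes, a range that contains an inserted key always reports `True`; a range with no keys reports `False` unless hash collisions produce a false positive. The per-level independent hashing used here scatters neighbouring prefixes across memory, which is the kind of cost the abstract says order-preserving hash functions are meant to reduce.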