Telescoping Filter: A Practical Adaptive Filter

Filters are fast, small, approximate set-membership data structures. They are often used to filter out expensive accesses to a remote set $S$ for negative queries (that is, queries for an $x \notin S$). Filters have one-sided error: on a negative query, a filter may answer "present" with a tunable false-positive probability $\epsilon$. Correctness is traded for space: filters use only $\log(1/\epsilon) + O(1)$ bits per element. The false-positive guarantees of most filters, however, hold only for a single query. In particular, if $x$ is a false positive of a filter, a subsequent query to $x$ is a false positive with probability 1, not $\epsilon$. With this in mind, recent work has introduced the notion of an adaptive filter. A filter is adaptive if each query has false-positive probability $\epsilon$, regardless of what queries were made in the past. This requires "fixing" false positives as they occur. Adaptive filters not only provide strong false-positive guarantees in adversarial environments but also improve performance on practical workloads by eliminating repeated false positives. Existing work on adaptive filters falls into two categories. First, there are practical filters based on cuckoo filters that attempt to fix false positives heuristically, without meeting the adaptivity guarantee. Second, there is the broom filter, a very complex adaptive filter that meets the optimal theoretical bounds. In this paper, we bridge this gap by designing a practical, provably adaptive filter: the telescoping adaptive filter. We provide theoretical false-positive and space guarantees for our filter, along with empirical results comparing its false-positive performance against state-of-the-art filters. We also test the throughput of our filter, showing that it achieves performance comparable to similar non-adaptive filters.
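
The repeated-false-positive behavior described above can be illustrated with a minimal sketch, assuming a toy fingerprint-based filter. The class `FingerprintFilter`, the helper `find_false_positive`, and the choice of 8-bit fingerprints are hypothetical names and parameters chosen for illustration. This is not the telescoping adaptive filter, and it keeps fingerprints in a Python set rather than in a space-efficient layout.

```python
import hashlib


class FingerprintFilter:
    """Toy NON-adaptive filter (illustrative assumption, not the paper's design).

    It remembers only a short fingerprint of each inserted item, so two
    different items with the same fingerprint are indistinguishable.
    """

    def __init__(self, fingerprint_bits):
        self.mask = (1 << fingerprint_bits) - 1
        self.fingerprints = set()

    def _fingerprint(self, item):
        # Deterministic hash of the item, truncated to fingerprint_bits bits.
        digest = hashlib.sha256(str(item).encode()).digest()
        return int.from_bytes(digest[:8], "big") & self.mask

    def insert(self, item):
        self.fingerprints.add(self._fingerprint(item))

    def query(self, item):
        # May return True for a non-member (false positive),
        # but never returns False for a member (one-sided error).
        return self._fingerprint(item) in self.fingerprints


def find_false_positive(filt, members, start=10**6):
    """Scan integers from `start` until a non-member is reported present."""
    x = start
    while x in members or not filt.query(x):
        x += 1
    return x


members = set(range(100))
filt = FingerprintFilter(fingerprint_bits=8)
for m in members:
    filt.insert(m)

fp = find_false_positive(filt, members)

# Re-querying the same false positive: the filter's state never changes,
# so the wrong answer repeats on every query.
repeat_results = [filt.query(fp) for _ in range(5)]
print(fp, repeat_results)
```

Because the toy filter's answer depends only on the fixed fingerprint of the query, every repeat of `fp` is reported present. An adaptive filter would instead update its internal state once the false positive is detected, so that later queries to the same element are again wrong only with probability $\epsilon$.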
