Bloom Filters are a fundamental and pervasive data structure. Within the growing area of Learned Data Structures, several Learned versions of Bloom Filters have been proposed, offering advantages over classic Filters. Each of them uses a classifier, which is the Learned part of the data structure. Although the classifier plays a central role in these new filters, and its space footprint and classification time may affect the performance of the Learned Filter, no systematic study is available of which classifier to use in which circumstances. We report progress in this area and provide initial guidelines on which classifier to choose among five classic classification paradigms.
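To make the role of the classifier concrete, the following is a minimal sketch of one common Learned Bloom Filter layout, not necessarily the exact variants studied here. It assumes a classifier that returns a score in [0, 1], a threshold `tau`, and a small classic Bloom Filter used as a backup for keys the classifier scores below `tau`. The names `BloomFilter`, `LearnedBloomFilter`, `toy_score`, and all parameter values are hypothetical and chosen only for illustration.

```python
import hashlib
from typing import Callable, Iterable, Iterator


class BloomFilter:
    """Classic Bloom Filter with m bits and k hash functions (SHA-256 based)."""

    def __init__(self, m: int, k: int) -> None:
        self.m, self.k = m, k
        self.bits = bytearray((m + 7) // 8)

    def _positions(self, key: str) -> Iterator[int]:
        # Derive k bit positions by salting the key with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, key: str) -> None:
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, key: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


class LearnedBloomFilter:
    """Classifier in front, classic Bloom Filter as backup (illustrative layout)."""

    def __init__(self, keys: Iterable[str], score: Callable[[str], float],
                 tau: float, m: int, k: int) -> None:
        self.score, self.tau = score, tau
        self.backup = BloomFilter(m, k)
        # Keys the classifier would reject go into the backup filter,
        # so the overall structure never produces a false negative.
        for key in keys:
            if score(key) < tau:
                self.backup.add(key)

    def __contains__(self, key: str) -> bool:
        return self.score(key) >= self.tau or key in self.backup


def toy_score(key: str) -> float:
    # Hypothetical stand-in for a trained classifier.
    return 1.0 if key.endswith(".bad") else 0.0


keys = ["a.bad", "b.bad", "c.com"]
lbf = LearnedBloomFilter(keys, toy_score, tau=0.5, m=1024, k=3)
print(all(key in lbf for key in keys))  # every inserted key is reported as present
```

In this layout both the classifier's space footprint and its evaluation time are paid on every query, which is why the choice of classifier can dominate the cost of the whole filter. A non-key that the classifier scores above `tau` is reported as present, so the classifier's own errors contribute to the false positive rate.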