Filters approximately store a set of items, trading accuracy for space efficiency, and can help address the limited memory on accelerators such as GPUs. However, high-performance, feature-rich GPU filters are lacking because most advances in filter research have focused on CPUs. In this paper, we explore the design space of filters with the goal of developing massively parallel, high-performance, feature-rich filters for GPUs. We evaluate various filter designs in terms of performance, usability, and supported features, and identify two designs that offer the right trade-off among them. We present two new GPU-based filters, the TCF and the GQF, that can be employed in various high-performance data analytics applications. The TCF is a set-membership filter and supports faster inserts and queries, whereas the GQF supports counting, which comes at an additional performance cost. Both the GQF and the TCF provide point and bulk insertion APIs and are designed to exploit the massive parallelism of the GPU without sacrificing usability or necessary features. In our benchmarks, the TCF and GQF are up to $4.4\times$ and $1.4\times$ faster, respectively, than previous GPU filters, while also overcoming the fundamental performance and usability constraints of current GPU filters.