Bit vectors are fundamental building blocks of many succinct data structures. They can be used to represent graphs, are an important part of many text indices in the form of the wavelet tree, and can be used to encode ordered sequences of integers as Elias-Fano codes. To do so, two queries have to be answered: namely rank and select queries. Given a position in the bit vector, a rank query returns the number of 1-bits before that position. A select query, given a parameter $j$, returns the position of the $j$-th 1-bit. On a length-n bit vector, both queries can be answered in $O(1)$ time and require $o(n)$ bits of additional space. In practice, the smallest (uncompressed) rank and select data structure cs-poppy has a space overhead of $\approx$ 3.51% [Zhou et al., SEA 13]. In this paper, we present an improved rank and select data structure that has the same space overhead but can answer queries up to 8% (rank) and 16.5% (select) faster compared with cs-poppy.
翻译:位元矢量是许多简明数据结构的基本构件。 位值矢量可以用来代表图表, 是波盘树形式的许多文本指数的重要部分, 也可以用来将整数的顺序编码成埃利亚斯- 法诺代码。 要这样做, 需要回答两个查询: 级别和选择查询。 在位盘中的位置, 排序查询返回位置之前的 1位位数。 选中查询, 给一个参数$j, 返回第 1位数的位数。 在长位矢量上, 两种查询都可以以 $(1) 的时间来回答, 并且需要 $( n) 位数的额外空间。 实际上, 最小的( 未压缩) 等级和选择的数据结构 c-poppy 的位位数为 $\ approx$ 3. 5% [Zhou et al., SEA 13] 。 在本文中, 我们展示了一个改进的级别和选择的数据结构, 具有相同的空间管理器位数, 但可以回答查询速度达到 8% (rik) 和 16.5 (roppy) 。