This paper considers a basic question: how strong a probabilistic guarantee can a hash table storing $n$ key/value pairs of $(1 + \Theta(1)) \log n$ bits each offer? Past work on this question has been bottlenecked by limitations of the known families of hash functions: the only hash tables to achieve failure probabilities less than $1 / 2^{\polylog n}$ require access to fully random hash functions; if the same hash tables are implemented using the known explicit families of hash functions, their failure probabilities become $1 / \poly(n)$. To get around these obstacles, we show how to construct a randomized data structure that has the same guarantees as a hash table, but that \emph{avoids the direct use of hash functions}. Building on this, we construct a hash table using $O(n)$ random bits that achieves failure probability $1 / n^{n^{1 - \epsilon}}$ for an arbitrary positive constant $\epsilon$. In fact, we show that this guarantee can even be achieved by a \emph{succinct dictionary}, that is, by a dictionary that uses space within a $1 + o(1)$ factor of the information-theoretic optimum. Finally, we also construct a succinct hash table whose probabilistic guarantees fall on a different extreme, offering a failure probability of $1 / \poly(n)$ while using only $\tilde{O}(\log n)$ random bits. This latter result matches (up to low-order terms) a guarantee previously achieved by Dietzfelbinger et al., but with increased space efficiency and with several surprising technical components.