This paper considers a basic question: how efficiently can a constant-time hash table, storing $n$ key/value pairs of $\Theta(\log n)$ bits each, make use of its random bits? That is, how many random bits does a hash table need in order to offer constant-time operations with probability $1 - 1 / \poly(n)$? And, if the number of random bits is unrestricted, what is the strongest probability guarantee that a hash table can offer? Past work on these questions has been bottlenecked by the limitations of the known families of hash functions. Existing hash tables that achieve failure probability $1 / \poly(n)$ use at least $\tilde{\Omega}(\log^2 n)$ random bits, which is the number of random bits needed to construct hash functions with $\tilde{\Omega}(\log n)$-wise independence. And the only hash tables that achieve failure probability less than $1 / 2^{\polylog n}$ require access to fully random hash functions; if the same hash tables are implemented using the known explicit families of hash functions, their failure probability becomes $1 / \poly(n)$. To get around these obstacles, we show how to construct a randomized data structure that has the same guarantees as a hash table, but that \emph{avoids the direct use of hash functions}. Building on this, we give nearly optimal solutions to both problems described above: we construct a hash table using $\tilde{O}(\log n)$ random bits that achieves failure probability $1 / \poly(n)$; and we construct a hash table using $O(n)$ random bits that achieves failure probability $1 / n^{n^{1 - \epsilon}}$ for an arbitrary positive constant $\epsilon$. Finally, if the keys/values are $(1 + \Theta(1)) \log n$ bits each, we show that the above guarantees can even be achieved by \emph{succinct dictionaries}, that is, by dictionaries that use space within a $1 + o(1)$ factor of the information-theoretic optimum.
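As a rough illustration of where the $\tilde{\Omega}(\log^2 n)$ figure comes from, assume the standard polynomial construction of a $k$-wise independent family (an assumption made here for illustration only; the abstract does not name a specific construction). A hash function is then a uniformly random polynomial of degree $k - 1$ over a finite field $F$ with $|F| = \poly(n)$, so its seed consists of $k$ coefficients of $\Theta(\log n)$ bits each. Writing $s$ (a hypothetical symbol) for the seed length in bits:
\[
  s \;=\; k \cdot \lceil \log_2 |F| \rceil \;=\; \Theta(k \log n),
  \qquad\text{so}\qquad
  k = \tilde{\Theta}(\log n) \;\Longrightarrow\; s = \tilde{\Theta}(\log^2 n).
\]
Under this assumption, a table that relies on $\tilde{\Omega}(\log n)$-wise independence spends roughly a $\log n$ factor more random bits than the $\tilde{O}(\log n)$ bound stated above.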