In classical Linear Hashing $\mathsf{LH}$ items $x\in \{1,2,\ldots, |U|\}$ are mapped to bins $\{0,1,\ldots, n-1\}$ by a function such as $$x\mapsto (ax+b)\mod p \mod n$$ for prime $p\in [|U|, 2|U|]$ and randomly chosen integers $a,b \in [1,p]$. Despite $\mathsf{LH}$'s simplicity understanding the expected maxload, i.e., number of elements in a fullest bin, of $\mathsf{LH}$ for worst-case inputs is a notoriously challenging open question. For hashing $n$ items the best known lower bound is $\Omega\left(\frac{\log n}{\log\log n}\right)$, whereas the best known upper bound is $\widetilde{O}(n^{1/3})$ due to Knudsen. In this paper we consider three modifications of classic $\mathsf{LH}$: (1) $\mathsf{LH}$ without the ``$+b$" shift term, resulting in loss of pairwise-independence. (2) $\mathsf{LH}$ with a composite, rather than prime, modulus. (3) $\mathsf{LH}$ in a continuous setting where the multiplier ``$a$" is chosen from $\mathbb{R}$ rather than $\mathbb{Z}$. We show that $\mathsf{LH}$ is fairly robust to these changes, in particular by demonstrating analogs of known maxload-bounds for these new variants. These results give several new perspectives on $\mathsf{LH}$, in particular showing that properties of $\mathsf{LH}$ such as pairwise-independence, a prime modulus, or even its setting in the integers may not be fundamental. We believe that these new perspectives, beyond being independently interesting, may also be useful in future work towards understanding the maxload of $\mathsf{LH}$.
翻译:暂无翻译