Proposed for rapid document similarity estimation in web search engines, the celebrated property of minwise independence imposes highly symmetric constraints on a family $\mathcal{F}$ of permutations of $\{1,\ldots, n\}$: $\mathcal{F}$ fulfills the property if, for each $j\in \{1,\ldots,n\}$, any cardinality-$j$ subset $X\subseteq \{1,\ldots,n\}$, and any fixed element $x^\ast\in X$, a randomly drawn permutation $\pi$ from $\mathcal{F}$ satisfies $\pi(x^\ast)=\min \{\pi(x) : x\in X\}$ with probability $1/j$. The central interest is to find a family with the fewest possible members that meets these constraints. We provide a framework that, firstly, is realized as a pure SAT model and, secondly, generalizes a heuristic of Mathon and van Trung to the search for these families. In its original form, this heuristic enforces an underlying group-theoretic decomposition to achieve a significant speed-up in the computer-aided search for structures that can be identified with so-called rankwise independent families. We observe that this approach is suitable for finding new, provably optimal representatives of minwise independent families while also yielding a decisive speed-up. As the problem has a naive search space of size at least $(n!)^n$, we also carefully address symmetry breaking. Finally, we add a bijective proof for a problem encountered by Bargachev when deriving a lower bound on the number of members in a minimal rankwise independent family.
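
To make the defining condition concrete, the following is a minimal brute-force sketch that tests whether a given family satisfies it. It is an illustration only, not the SAT framework described above. The function name `is_minwise_independent` is hypothetical, elements are indexed from $0$ instead of $1$, and the sketch assumes that $\pi$ is drawn uniformly from $\mathcal{F}$, so that the probability equals the fraction of members attaining the minimum at $x^\ast$.

```python
from fractions import Fraction
from itertools import combinations, permutations


def is_minwise_independent(family, n):
    """Brute-force test of the minwise independence condition.

    `family` is a list of permutations of {0, ..., n-1}, each given as a
    tuple `pi` with `pi[x]` the image of x. Assumption: a permutation is
    drawn uniformly from `family`.
    """
    size = len(family)
    for j in range(1, n + 1):
        for subset in combinations(range(n), j):
            for x_star in subset:
                # Count the members that map x_star to the minimum over the subset.
                hits = sum(
                    1
                    for pi in family
                    if pi[x_star] == min(pi[x] for x in subset)
                )
                # Exact rational comparison avoids floating-point error.
                if Fraction(hits, size) != Fraction(1, j):
                    return False
    return True


if __name__ == "__main__":
    full_s3 = list(permutations(range(3)))
    cyclic = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
    print(is_minwise_independent(full_s3, 3))
    print(is_minwise_independent(cyclic, 3))
```

The check enumerates all nonempty subsets and all members of the family, so its cost grows exponentially in $n$; it is meant for verifying small candidate families, not for searching for them.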