The seminal work of Bencz\'ur and Karger demonstrated cut sparsifiers of near-linear size, with several applications throughout theoretical computer science. Subsequent extensions have yielded sparsifiers for hypergraph cuts and more recently linear codes over Abelian groups. A decade ago, Kogan and Krauthgamer asked about the sparsifiability of arbitrary constraint satisfaction problems (CSPs). For this question, a trivial lower bound is the size of a non-redundant CSP instance, which admits, for each constraint, an assignment satisfying only that constraint (so that no constraint can be dropped by the sparsifier). For graph cuts, spanning trees are non-redundant instances. Our main result is that redundant clauses are sufficient for sparsification: for any CSP predicate R, every unweighted instance of CSP(R) has a sparsifier of size at most its non-redundancy (up to polylog factors). For weighted instances, we similarly pin down the sparsifiability to the so-called chain length of the predicate. These results precisely determine the extent to which any CSP can be sparsified. A key technical ingredient in our work is a novel application of the entropy method from Gilmer's recent breakthrough on the union-closed sets conjecture. As an immediate consequence of our main theorem, a number of results in the non-redundancy literature immediately extend to CSP sparsification. We also contribute new techniques for understanding the non-redundancy of CSP predicates. In particular, we give an explicit family of predicates whose non-redundancy roughly corresponds to the structure of matching vector families in coding theory. By adapting methods from the matching vector codes literature, we are able to construct an explicit predicate whose non-redundancy lies between $\Omega(n^{1.5})$ and $\widetilde{O}(n^{1.6})$, the first example with a provably non-integral exponent.
翻译:暂无翻译