Modern CDCL SAT solvers easily solve industrial instances containing tens of millions of variables and clauses, despite the theoretical intractability of the SAT problem. This gap between practice and theory is a central problem in solver research. It is believed that SAT solvers exploit structure inherent in industrial instances, and hence there have been numerous attempts over the last 25 years at characterizing this structure via parameters. These can be classified as rigorous, i.e., they serve as a basis for complexity-theoretic upper bounds (e.g., backdoors), or correlative, i.e., they correlate well with solver run time and are observed in industrial instances (e.g., community structure). Unfortunately, no parameter proposed to date has been shown to be both strongly correlative and rigorous over a large fraction of industrial instances. Given the sheer difficulty of the problem, we aim for an intermediate goal of proposing a set of parameters that is strongly correlative and has good theoretical properties. Specifically, we propose parameters based on a graph partitioning called Hierarchical Community Structure (HCS), which captures the recursive community structure of a graph of a Boolean formula. We show that HCS parameters are strongly correlative with solver run time using an Empirical Hardness Model, and further build a classifier based on HCS parameters that distinguishes between easy industrial and hard random/crafted instances with very high accuracy. We further strengthen our hypotheses via scaling studies. On the theoretical side, we show that counterexamples which plagued community structure do not apply to HCS, and that there is a subset of HCS parameters such that restricting them limits the size of embeddable expanders.