Feature extraction is a fundamental task in applying machine learning methods to SAT solving. It is used for algorithm selection and configuration in solver portfolios and for satisfiability classification. Many approaches have been proposed for extracting meaningful attributes from CNF instances, but most lack a working or up-to-date implementation, and their brief descriptions are unclear, which hinders reproducibility. Furthermore, the literature lacks a comparison among the features. This paper introduces SATfeatPy, a library that offers feature extraction techniques for SAT problems in CNF. The package implements all the structural and statistical features from three major papers in the field. The library is provided as an up-to-date, easy-to-use Python package alongside a detailed feature description. We show that SAT/UNSAT classification and problem-category classification both achieve high accuracy using five sets of features generated with our library from a dataset of 3000 SAT and UNSAT instances spanning ten different classes of problems. Finally, in an ablation study, we compare the usefulness and importance of the features for predicting a SAT instance's original structure.