Traditional automated theorem provers for first-order logic depend on speed-optimized search and many handcrafted heuristics designed to work well across a wide range of domains. Machine learning approaches in the literature either depend on these traditional provers to bootstrap themselves or fall short of comparable performance. In this paper, we propose a general incremental learning algorithm for training domain-specific provers for first-order logic without equality, based only on a basic given-clause algorithm, but using a learned clause-scoring function. Clauses are represented as graphs and presented to transformer networks with spectral features. To address the sparsity of training data, its initial absence, and the lack of a natural curriculum, we adapt hindsight experience replay to theorem proving, so that learning is possible even when no proof can be found. We show that provers trained this way can match and sometimes surpass state-of-the-art traditional provers on the TPTP dataset in terms of both the quantity and the quality of the proofs.
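The given-clause loop with a learned scoring function, as described above, can be sketched in miniature. This is an illustrative sketch only: clauses are frozensets of integer literals (a negative integer denotes a negated atom), inference is restricted to propositional binary resolution, and the `score` argument stands in for the paper's learned clause-scoring network (which operates on graph representations with spectral features); all names here are hypothetical, not the authors' implementation.

```python
def resolve(c1, c2):
    """All binary resolvents of two propositional clauses."""
    out = []
    for lit in c1:
        if -lit in c2:
            out.append(frozenset((c1 - {lit}) | (c2 - {-lit})))
    return out

def given_clause(axioms, score, max_steps=1000):
    """Basic given-clause saturation: return True if the empty clause
    (a refutation) is derived, False if the step budget runs out or
    the clause set saturates."""
    processed, unprocessed = set(), set(axioms)
    for _ in range(max_steps):
        if not unprocessed:
            return False  # saturated without finding a contradiction
        # The scorer (in the paper, a learned network) selects the next clause.
        given = max(unprocessed, key=score)
        unprocessed.discard(given)
        for other in processed | {given}:
            for resolvent in resolve(given, other):
                if not resolvent:
                    return True  # empty clause derived: proof found
                if resolvent not in processed:
                    unprocessed.add(resolvent)
        processed.add(given)
    return False

# Refute {p}, {~p v q}, {~q}: the set is unsatisfiable, so a proof is found.
axioms = [frozenset({1}), frozenset({-1, 2}), frozenset({-2})]
print(given_clause(axioms, score=lambda c: -len(c)))  # True
```

Here a shortest-clause-first heuristic plays the role of the learned scorer; the paper's contribution is precisely to replace such handcrafted heuristics with a trained model, and to generate training signal via hindsight experience replay when the loop fails to find a proof.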