Labeled data for imitation learning of theorem proving in large libraries of formalized mathematics is scarce, as such libraries require years of concentrated effort by human specialists to build. This is particularly challenging when applying large Transformer language models to tactic prediction, because the scaling of performance with respect to model size is quickly disrupted in the data-scarce, easily-overfitted regime. We propose PACT ({\bf P}roof {\bf A}rtifact {\bf C}o-{\bf T}raining), a general methodology for extracting abundant self-supervised data from kernel-level proof terms for co-training alongside the usual tactic prediction objective. We apply this methodology to Lean, an interactive proof assistant that hosts some of the most sophisticated formalized mathematics to date. We instrument Lean with a neural theorem prover driven by a Transformer language model and show that PACT improves the theorem proving success rate on a held-out suite of test theorems from 32\% to 48\%.
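As a minimal illustration of the co-training recipe described above, the sketch below mixes scarce human-written tactic-prediction examples with self-supervised examples mined from proof terms into a single language-modeling data stream. All task names, example strings, and the \texttt{sample\_batch} helper are hypothetical assumptions introduced for exposition; they are not the paper's actual data formats or pipeline.

\begin{verbatim}
# Hypothetical sketch of PACT-style co-training (illustrative only):
# every task name, example string, and helper below is an assumption,
# not the paper's actual data format.
import random

# One language-modeling objective over (prompt, completion) pairs.
# PACT mixes human-written tactic-prediction data with self-supervised
# data extracted from kernel-level proof terms.
TASKS = {
    "tactic_prediction": [("GOAL |- a + b = b + a", "exact add_comm a b")],
    "next_lemma":        [("NEXT |- a + b = b + a", "add_comm")],
    "proof_term":        [("TERM |- a + b = b + a", "add_comm a b")],
}

def sample_batch(batch_size, weights):
    """Draw one co-training batch; each example is a prompt/completion
    pair serialized into a single training string."""
    names = list(TASKS)
    probs = [weights[n] for n in names]
    batch = []
    for _ in range(batch_size):
        task = random.choices(names, weights=probs)[0]
        prompt, completion = random.choice(TASKS[task])
        batch.append(prompt + " RESULT " + completion)
    return batch

if __name__ == "__main__":
    # Weight the scarce tactic data equally against the mined data.
    for example in sample_batch(4, {"tactic_prediction": 1.0,
                                    "next_lemma": 1.0,
                                    "proof_term": 1.0}):
        print(example)
\end{verbatim}

Sampling task-tagged examples into one training stream is one simple way to realize co-training of multiple objectives with a single autoregressive model; the mixing proportions and serialization used in the actual system may differ.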