We present Prover Agent, a novel AI agent for automated theorem proving that integrates large language models (LLMs) with a formal proof assistant, Lean. Prover Agent coordinates an informal reasoning LLM, a formal prover model, and feedback from Lean while also generating auxiliary lemmas. These auxiliary lemmas are not limited to subgoals in the formal proof but can also include special cases or potentially useful facts derived from the assumptions, which help in discovering a viable proof strategy. It achieves an 88.1% success rate on the MiniF2F benchmark, establishing a new state-of-the-art among methods using small language models (SLMs) with a much lower sample budget than previous approaches. We also present theoretical analyses and case studies that illustrate how these generated lemmas contribute to solving challenging problems. Our code is publicly available at: https://github.com/kAIto47802/Prover-Agent.
翻译:我们提出了Prover Agent,一种用于自动定理证明的新型人工智能智能体,它将大型语言模型(LLMs)与形式化证明辅助工具Lean相结合。Prover Agent协调一个非形式化推理的LLM、一个形式化证明模型以及来自Lean的反馈,同时生成辅助引理。这些辅助引理不仅限于形式化证明中的子目标,还可以包括从假设中推导出的特殊情形或潜在有用的事实,这有助于发现可行的证明策略。在MiniF2F基准测试中,它实现了88.1%的成功率,在使用小语言模型(SLMs)的方法中确立了新的最先进水平,且所需的样本预算远低于先前方法。我们还提供了理论分析和案例研究,以说明这些生成的引理如何帮助解决具有挑战性的问题。我们的代码公开于:https://github.com/kAIto47802/Prover-Agent。