Automated theorem proving in Euclidean geometry, particularly for International Mathematical Olympiad (IMO) level problems, remains a major challenge and an important research focus in Artificial Intelligence. In this paper, we present a highly efficient method for geometry theorem proving that runs entirely on CPUs without relying on neural network-based inference. Our initial study shows that a simple random strategy for adding auxiliary points can achieve silver-medal level human performance on IMO. Building on this, we propose HAGeo, a Heuristic-based method for adding Auxiliary constructions in Geometric deduction that solves 28 of 30 problems on the IMO-30 benchmark, achieving gold-medal level performance and surpassing AlphaGeometry, a competitive neural network-based approach, by a notable margin. To evaluate our method and existing approaches more comprehensively, we further construct HAGeo-409, a benchmark consisting of 409 geometry problems with human-assessed difficulty levels. Compared with the widely used IMO-30, our benchmark poses greater challenges and provides a more precise evaluation, setting a higher bar for geometry theorem proving.
翻译:欧几里得几何中的自动定理证明,尤其是针对国际数学奥林匹克(IMO)级别的问题,仍然是人工智能领域的一项重大挑战和重要研究方向。本文提出了一种高效的几何定理证明方法,该方法完全在CPU上运行,无需依赖基于神经网络的推理。我们的初步研究表明,采用简单的随机添加辅助点策略即可在IMO问题上达到银牌级别的人类表现。在此基础上,我们提出了HAGeo——一种基于启发式的几何演绎辅助构造方法,该方法在IMO-30基准测试的30个问题中解决了28个,达到了金牌级别的性能,并以显著优势超越了基于神经网络的竞争方法AlphaGeometry。为了更全面地评估我们的方法及现有方法,我们进一步构建了HAGeo-409基准,该基准包含409个具有人工评估难度等级的几何问题。与广泛使用的IMO-30相比,我们的基准提出了更大的挑战,提供了更精确的评估,为几何定理证明设定了更高的标准。