We introduce a machine-learning-based tool for the Lean proof assistant that suggests relevant premises for theorems being proved by a user. The design principles for the tool are (1) tight integration with the proof assistant, (2) ease of use and installation, and (3) a lightweight and fast approach. For this purpose, we designed a custom version of the random forest model, trained in an online fashion. It is implemented directly in Lean, which was possible thanks to the rich and efficient metaprogramming features of Lean 4. The random forest is trained on data extracted from mathlib, Lean's mathematics library. We experiment with various options for producing training features and labels. The advice from a trained model is accessible to the user via the suggest_premises tactic, which can be called in an editor while constructing a proof interactively.