We explore the use of expert iteration in the context of language modeling applied to formal mathematics. We show that at same compute budget, expert iteration, by which we mean proof search interleaved with learning, dramatically outperforms proof search only. We also observe that when applied to a collection of formal statements of sufficiently varied difficulty, expert iteration is capable of finding and solving a curriculum of increasingly difficult problems, without the need for associated ground-truth proofs. Finally, by applying this expert iteration to a manually curated set of problem statements, we achieve state-of-the-art on the miniF2F benchmark, automatically solving multiple challenging problems drawn from high school olympiads.
翻译:我们探索了在适用于正规数学的语言模型中使用专家迭代的方法。我们表明,在计算预算时,专家迭代(我们指在学习中相互交错的校对搜索),仅能大大优于校对搜索。我们还注意到,当应用于一系列困难程度各异的正式报表时,专家迭代能够找到和解决一个越来越困难的课程,而不需要相关的地面真实证据。最后,通过将该专家迭代应用到一组手工拼凑的问题说明中,我们实现了在微型F2F基准上最先进的技术,自动解决从高中奥林匹亚学派学来的多重挑战性问题。