BERT 不是“计数 : 学习将数学语句与证明匹配” (BERT is not The Count: Learning to Match Mathematical Statements with Proofs)

from arxiv, Accepted to the Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023; 14 pages. arXiv admin note: substantial text overlap with arXiv:2102.02110

We introduce a task consisting in matching a proof to a given mathematical statement. The task fits well within current research on Mathematical Information Retrieval and, more generally, mathematical article analysis (Mathematical Sciences, 2014). We present a dataset for the task (the MATcH dataset) consisting of over 180k statement-proof pairs extracted from modern mathematical research articles. We find this dataset highly representative of our task, as it consists of relatively new findings useful to mathematicians. We propose a bilinear similarity model and two decoding methods to match statements to proofs effectively. While the first decoding method matches a proof to a statement without being aware of other statements or proofs, the second method treats the task as a global matching problem. Through a symbol replacement procedure, we analyze the "insights" that pre-trained language models have in such mathematical article analysis and show that while these models perform well on this task with the best performing mean reciprocal rank of 73.7, they follow a relatively shallow symbolic analysis and matching to achieve that performance.

翻译：我们引入了一个任务, 包括将证据与给定数学语句相匹配。任务非常适合当前关于数学信息检索和一般数学文章分析的研究( 数学科学, 2014) 。我们为任务提供了一套数据集( MATCH 数据集), 由现代数学研究文章中提取的180k 防对称配对组成。我们发现这个数据集非常能代表我们的任务, 因为它包含对数学家有用的相对较新的发现。我们提出了一个双线相似模型和两种解码方法, 以有效匹配语句和证据。虽然第一个解码方法将证据与声明匹配, 但没有意识到其他语句或证据, 但第二个方法将任务视为一个全球匹配问题。我们通过一个符号替换程序, 分析经过预先训练的语言模型在数学文章分析中具有的“ 透视”, 并显示这些模型在这项工作上表现良好, 其最佳表现为平均对等等级为73.7, 它们遵循相对浅的符号分析和匹配, 以达到这一效果。