This paper aims to advance the mathematical intelligence of machines by presenting the first Chinese mathematical pre-trained language model~(PLM) for effectively understanding and representing mathematical problems. Unlike texts in standard NLP tasks, mathematical texts are difficult to understand, since they involve mathematical terminology, symbols and formulas in the problem statement. Typically, solving mathematical problems requires complex mathematical logic and background knowledge. Considering the complex nature of mathematical texts, we design a novel curriculum pre-training approach for improving the learning of mathematical PLMs, consisting of both basic and advanced courses. Specifically, we first perform token-level pre-training based on a position-biased masking strategy, and then design logic-based pre-training tasks that aim to recover the shuffled sentences and formulas, respectively. Finally, we introduce a more difficult pre-training task that requires the PLM to detect and correct the errors in its generated solutions. We conduct extensive experiments on offline evaluation (including nine math-related tasks) and an online $A/B$ test. Experimental results demonstrate the effectiveness of our approach compared with a number of competitive baselines. Our code is available at: \textcolor{blue}{\url{https://github.com/RUCAIBox/JiuZhang}}.
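To make the basic token-level course concrete, the following is a minimal sketch of a position-biased masking step; the linear position schedule, the base masking rate, and the function name are illustrative assumptions rather than the exact configuration used in our pre-training.

\begin{verbatim}
import random

def position_biased_mask(tokens, base_rate=0.15, mask_token="[MASK]"):
    """Mask tokens with a probability that grows with position.

    A minimal sketch of position-biased masking: later tokens are
    masked more often than earlier ones. The linear schedule and the
    2x upper bound are illustrative assumptions, not the paper's
    exact setting.
    """
    n = len(tokens)
    masked = []
    for i, tok in enumerate(tokens):
        # masking probability ramps from base_rate up to 2 * base_rate
        p = base_rate * (1.0 + i / max(n - 1, 1))
        masked.append(mask_token if random.random() < p else tok)
    return masked
\end{verbatim}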