This paper aims to advance the mathematical intelligence of machines by presenting the first Chinese mathematical pre-trained language model~(PLM) for effectively understanding and representing mathematical problems. Unlike texts in standard NLP tasks, mathematical texts are difficult to understand, since they involve mathematical terminology, symbols and formulas in the problem statement. Typically, solving mathematical problems requires complex mathematical logic and background knowledge. Considering the complex nature of mathematical texts, we design a novel curriculum pre-training approach for improving the learning of mathematical PLMs, consisting of both basic and advanced courses. Specifically, we first perform token-level pre-training based on a position-biased masking strategy, and then design logic-based pre-training tasks that aim to recover the shuffled sentences and formulas, respectively. Finally, we introduce a more difficult pre-training task that requires the PLM to detect and correct the errors in its generated solutions. We conduct extensive experiments on offline evaluation (including nine math-related tasks) and an online $A/B$ test. Experimental results demonstrate the effectiveness of our approach compared with a number of competitive baselines. Our code is available at: \textcolor{blue}{\url{https://github.com/RUCAIBox/JiuZhang}}.
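To make the basic token-level course concrete, the following is a minimal sketch of a position-biased masking step; the linear position schedule, the base masking rate, and the function name are illustrative assumptions rather than the exact configuration used in our pre-training.

\begin{verbatim}
import random

def position_biased_mask(tokens, base_rate=0.15, mask_token="[MASK]"):
    """Mask tokens with a probability that grows with position.

    A minimal sketch of position-biased masking: later tokens are
    masked more often than earlier ones. The linear schedule and the
    2x upper bound are illustrative assumptions, not the paper's
    exact setting.
    """
    n = len(tokens)
    masked = []
    for i, tok in enumerate(tokens):
        # masking probability ramps from base_rate up to 2 * base_rate
        p = base_rate * (1.0 + i / max(n - 1, 1))
        masked.append(mask_token if random.random() < p else tok)
    return masked
\end{verbatim}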