Existing generative pre-trained language models (e.g., GPT) focus on modeling the linguistic structure and semantics of general text. However, these models do not consider the numerical properties of numbers and do not perform robustly on numerical reasoning tasks (e.g., math word problems and measurement estimation). In this paper, we propose NumGPT, a generative pre-trained model that explicitly models the numerical properties of numbers in text. Specifically, it leverages a prototype-based numeral embedding to encode the mantissa of a number and an individual embedding to encode its exponent. A numeral-aware loss function is designed to integrate numerals into the pre-training objective of NumGPT. We conduct extensive experiments on four datasets to evaluate the numeracy of NumGPT. The experimental results show that NumGPT outperforms baseline models (e.g., GPT and GPT with DICE) on a range of numerical reasoning tasks, including measurement estimation, number comparison, math word problems, and magnitude classification. Ablation studies further evaluate the impact of pre-training and model hyperparameters on performance.
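To make the embedding scheme concrete, below is a minimal PyTorch sketch of a prototype-based numeral embedding under the mantissa/exponent decomposition described above. The class and layer names (NumeralEmbedding, mantissa_proj, exponent_emb), the Gaussian-kernel similarity, the prototype placement, and all hyperparameter values are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn


class NumeralEmbedding(nn.Module):
    """Sketch: embed a number via its mantissa and exponent (assumed design)."""

    def __init__(self, num_prototypes: int = 10, num_exponents: int = 20,
                 dim: int = 64, sigma: float = 0.5):
        super().__init__()
        # Learnable mantissa prototypes, initialized to span [1, 10).
        self.prototypes = nn.Parameter(torch.linspace(1.0, 10.0, num_prototypes))
        # Projects the vector of kernel similarities to the model dimension.
        self.mantissa_proj = nn.Linear(num_prototypes, dim)
        # One embedding per (shifted) integer exponent.
        self.exponent_emb = nn.Embedding(num_exponents, dim)
        self.sigma = sigma
        self.exp_offset = num_exponents // 2  # allow negative exponents

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Decompose x into mantissa in [1, 10) and integer exponent
        # (zero and sign handling omitted for brevity).
        exponent = torch.floor(torch.log10(torch.abs(x)))
        mantissa = x / torch.pow(10.0, exponent)
        # Gaussian-kernel similarity of the mantissa to each prototype.
        diff = mantissa.unsqueeze(-1) - self.prototypes
        sim = torch.exp(-diff.pow(2) / (2 * self.sigma ** 2))
        # Combine the mantissa and exponent components into one embedding.
        idx = (exponent.long() + self.exp_offset).clamp(
            0, self.exponent_emb.num_embeddings - 1)
        return self.mantissa_proj(sim) + self.exponent_emb(idx)


# Usage: embed a batch of numerals appearing in text.
emb = NumeralEmbedding()
print(emb(torch.tensor([3.2, 450.0, 0.07])).shape)  # torch.Size([3, 64])
```

The split mirrors scientific notation: the prototype kernels give a smooth, magnitude-free representation of the significand, while the discrete exponent embedding captures order of magnitude, so numbers of very different scales do not compete for the same embedding capacity.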