Pretrained Language Models (LMs) have been shown to possess significant linguistic, common sense, and factual knowledge. One form of knowledge that has not been studied yet in this context is information about the scalar magnitudes of objects. We show that pretrained language models capture a significant amount of this information but are short of the capability required for general common-sense reasoning. We identify contextual information in pre-training and numeracy as two key factors affecting their performance and show that a simple method of canonicalizing numbers can have a significant effect on the results.
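To make the idea of number canonicalization concrete, the sketch below maps surface numerals to a normalized scientific-notation form before text is fed to a model; the regex, function names, and exact output format are illustrative assumptions rather than the paper's exact procedure.

    import re

    def canonicalize_number(token: str) -> str:
        # Map a surface numeral (e.g. "1,500") to a normalized
        # scientific-notation string such as "1.5e+03", so that
        # magnitude is exposed uniformly regardless of formatting.
        cleaned = token.replace(",", "")
        try:
            value = float(cleaned)
        except ValueError:
            return token  # leave non-numeric tokens untouched
        return f"{value:.1e}"

    def canonicalize_text(text: str) -> str:
        # Replace every numeral in a sentence with its canonical form.
        return re.sub(r"\d[\d,]*(?:\.\d+)?",
                      lambda m: canonicalize_number(m.group()),
                      text)

    if __name__ == "__main__":
        print(canonicalize_text("A grizzly bear weighs about 270 kilograms."))
        # -> "A grizzly bear weighs about 2.7e+02 kilograms."

The point of such a transformation is that models struggle with the many surface forms a single magnitude can take; collapsing them into one canonical representation is one simple way to reduce that burden.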