The ability to understand and work with numbers (numeracy) is critical for many complex reasoning tasks. Currently, most NLP models treat numbers in text in the same way as other tokens---they embed them as distributed vectors. Is this enough to capture numeracy? We begin by investigating the numerical reasoning capabilities of a state-of-the-art question answering model on the DROP dataset. We find this model excels on questions that require numerical reasoning, i.e., it already captures numeracy. To understand how this capability emerges, we probe token embedding methods (e.g., BERT, GloVe) on synthetic list maximum, number decoding, and addition tasks. A surprising degree of numeracy is naturally present in standard embeddings. For example, GloVe and word2vec accurately encode magnitude for numbers up to 1,000. Furthermore, character-level embeddings are even more precise---ELMo captures numeracy the best for all pre-trained methods---but BERT, which uses sub-word units, is less exact.
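As a rough illustration of the probing setup described above (not the paper's exact model), the sketch below trains a small probe on the synthetic list-maximum task over a frozen embedding table. Random vectors stand in for the pre-trained GloVe/word2vec/ELMo/BERT number embeddings, and names such as `MaxProbe`, `EMB_DIM`, and `MAX_NUM` are illustrative assumptions rather than anything from the paper.

```python
# Minimal sketch of the list-maximum probing task: a frozen embedding table
# for numbers plus a small trainable probe that scores each list position.
import torch
import torch.nn as nn

EMB_DIM, LIST_LEN, MAX_NUM = 50, 5, 99

# Stand-in "pre-trained" number embeddings (frozen); the paper probes
# embeddings from GloVe, word2vec, ELMo, and BERT instead.
number_embeddings = nn.Embedding(MAX_NUM + 1, EMB_DIM)
number_embeddings.weight.requires_grad_(False)

class MaxProbe(nn.Module):
    """Scores each position in the list; the argmax should be the largest number."""
    def __init__(self, emb_dim):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(emb_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, embedded_lists):                   # (batch, LIST_LEN, emb_dim)
        return self.scorer(embedded_lists).squeeze(-1)   # (batch, LIST_LEN)

probe = MaxProbe(EMB_DIM)
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(2000):
    numbers = torch.randint(0, MAX_NUM + 1, (32, LIST_LEN))  # random number lists
    targets = numbers.argmax(dim=1)                          # index of the maximum
    logits = probe(number_embeddings(numbers))
    loss = loss_fn(logits, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Evaluate on fresh lists: high accuracy indicates the frozen embeddings
# encode enough magnitude information for the probe to locate the maximum.
with torch.no_grad():
    numbers = torch.randint(0, MAX_NUM + 1, (1000, LIST_LEN))
    preds = probe(number_embeddings(numbers)).argmax(dim=1)
    print("accuracy:", (preds == numbers.argmax(dim=1)).float().mean().item())
```

The probe here scores each position independently for brevity; the paper's probing models differ in architectural details, but the principle is the same: the embeddings are kept fixed, so any success must come from numerical information already present in them.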