Many contextualized word representations are now learned by intricate neural network models, such as masked neural language models (MNLMs), which consist of huge neural network structures and are trained to restore masked text. Such representations demonstrate superhuman performance in some reading comprehension (RC) tasks, which require extracting the correct answer from a context given a question. However, identifying the detailed knowledge trained in MNLMs is challenging owing to their numerous and intermingled model parameters. This paper provides new insights and empirical analyses on the commonsense knowledge included in pretrained MNLMs. First, we use a diagnostic test that evaluates whether commonsense knowledge is properly trained in MNLMs. We observe that a large proportion of commonsense knowledge is not appropriately trained in MNLMs and that MNLMs often do not understand the semantic meaning of relations accurately. In addition, we find that MNLM-based RC models are still vulnerable to semantic variations that require commonsense knowledge. Finally, we discover the fundamental reason why some knowledge is not trained. We further suggest that utilizing an external commonsense knowledge repository can be an effective solution. We illustrate, in controlled experiments, the possibility of overcoming the limitations of MNLM-based RC models by enriching text with the required knowledge from an external commonsense knowledge repository.