Inspired by evidence that pretrained language models (LMs) encode commonsense knowledge, recent work has applied LMs to automatically populate commonsense knowledge graphs (CKGs). However, there is a lack of understanding of how well they generalize to multiple CKGs, unseen relations, and novel entities. This paper analyzes the ability of LMs to perform generalizable commonsense inference in terms of three aspects: knowledge capacity, transferability, and induction. Our experiments on these three aspects show that: (1) LMs can adapt to different schemas defined by multiple CKGs but fail to reuse the knowledge to generalize to new relations. (2) Adapted LMs generalize well to unseen subjects, but less so to novel objects. Future work should investigate how to improve the transferability and induction of commonsense mining from LMs.
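The CKG population task described above amounts to completing (subject, relation, object) triples with a generative LM. The sketch below illustrates that setup with an off-the-shelf GPT-2 model and a COMET-style prompt of the form "subject relation"; the specific model, prompt format, and relation name are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch: prompting a causal LM to infer the object of a commonsense triple.
# Assumes the HuggingFace `transformers` library; the prompt format is a simplification.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Subject phrase plus relation token; the model is expected to generate the object phrase.
prompt = "PersonX goes to the dentist xIntent"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=10,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

In practice the LM would be fine-tuned on triples from one or more CKGs before generation; probing it with held-out relations, subjects, and objects then measures the transferability and induction abilities the abstract refers to.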