Relational knowledge bases (KBs) are commonly used to represent world knowledge in machines. However, while advantageous for their high degree of precision and interpretability, KBs are usually organized according to manually-defined schemas, which limit their expressiveness and require significant human effort to engineer and maintain. In this review, we take a natural language processing perspective on these limitations, examining how they may be addressed in part by training deep contextual language models (LMs) to internalize and express relational knowledge in more flexible forms. We propose to organize knowledge representation strategies in LMs by the level of KB supervision provided, from no KB supervision at all to entity- and relation-level supervision. Our contributions are threefold: (1) We provide a high-level, extensible taxonomy for knowledge representation in LMs; (2) Within our taxonomy, we highlight notable models, evaluation tasks, and findings, in order to provide an up-to-date review of current knowledge representation capabilities in LMs; and (3) We suggest future research directions that build on the complementary strengths of LMs and KBs as knowledge representations.