Pre-trained models learn informative representations from large-scale training data through self-supervised or supervised learning, and after fine-tuning they have achieved promising performance in natural language processing (NLP), computer vision (CV), and cross-modal fields. These models, however, suffer from poor robustness and a lack of interpretability. Pre-trained models with knowledge injection, which we call knowledge-enhanced pre-trained models (KEPTMs), acquire deeper understanding and logical reasoning abilities and gain interpretability. In this survey, we provide a comprehensive overview of KEPTMs in NLP and CV. We first introduce the progress of pre-trained models and knowledge representation learning. We then systematically categorize existing KEPTMs from three different perspectives. Finally, we outline some potential directions of KEPTMs for future research.