We present a novel knowledge-enhanced language representation model called ERNIE (Enhanced Representation through kNowledge IntEgration). Inspired by the masking strategy of BERT, ERNIE is designed to learn language representation enhanced by knowledge masking strategies, which include entity-level masking and phrase-level masking. The entity-level strategy masks entities, which are usually composed of multiple words. The phrase-level strategy masks the whole phrase, which is composed of several words standing together as a conceptual unit. Experimental results show that ERNIE outperforms other baseline methods, achieving new state-of-the-art results on five Chinese natural language processing tasks including natural language inference, semantic similarity, named entity recognition, sentiment analysis, and question answering. We also demonstrate that ERNIE has more powerful knowledge inference capacity on a cloze test.
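To illustrate the idea, the knowledge masking strategies described above can be sketched as follows. This is a minimal, hypothetical Python sketch, not the paper's implementation: it assumes the entity/phrase spans have already been identified (by a NER tagger or phrase chunker), and it masks each multi-word unit as a whole rather than masking individual word pieces independently, as basic BERT masking would.

```python
import random

MASK = "[MASK]"

def knowledge_mask(tokens, spans, mask_prob=0.15, seed=0):
    """Mask whole entity/phrase spans instead of single tokens.

    tokens: list of token strings
    spans:  list of (start, end) index pairs (end exclusive) marking
            pre-identified entities or phrases; every position outside
            a span is treated as its own single-word unit.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Collect maskable units: the given spans, plus length-1 units
    # for every token not covered by a span.
    covered = set()
    units = []
    for s, e in spans:
        units.append((s, e))
        covered.update(range(s, e))
    units += [(i, i + 1) for i in range(len(tokens)) if i not in covered]
    units.sort()

    out = list(tokens)
    for s, e in units:
        if rng.random() < mask_prob:
            # Mask the whole unit at once, so the model must recover
            # the entity or phrase from context rather than from its
            # remaining word pieces.
            for i in range(s, e):
                out[i] = MASK
    return out

sent = ["Harry", "Potter", "is", "a", "series", "of", "fantasy", "novels"]
# Span (0, 2) marks the entity "Harry Potter"; both tokens are masked
# together or kept together.
masked = knowledge_mask(sent, spans=[(0, 2)], mask_prob=0.5, seed=1)
```

The key property is that the two tokens of "Harry Potter" are always masked jointly, which is what pushes the model toward learning entity-level knowledge instead of completing an entity from its own visible pieces.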