Although pre-trained language models share a common semantic encoder, natural language understanding (NLU) suffers from a diversity of output schemas. In this paper, we propose UBERT, a unified bidirectional language understanding model built on the BERT framework, which universally models the training objectives of different NLU tasks through a biaffine network. Specifically, UBERT encodes prior knowledge from various aspects and uniformly constructs learning representations across multiple NLU tasks, which enhances its ability to capture common semantic understanding. By using the biaffine network to score pairs of start and end positions in the original text, various classification and extraction structures can be converted into a universal span-decoding approach. Experiments show that UBERT won first prize in the Chinese insurance few-shot multi-task track of the 2022 AIWIN World Artificial Intelligence Innovation Competition, and unifies a broad range of information extraction and linguistic reasoning tasks.
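The span-decoding formulation described above can be illustrated with a biaffine scorer over start/end token pairs. The following is a minimal PyTorch sketch assuming the standard biaffine form s(i, j) = h_i^T U h_j + W[h_i; h_j] + b; the class name, layer sizes, and thresholded decoding are assumptions for illustration, not details of the UBERT implementation.

```python
import torch
import torch.nn as nn

class BiaffineSpanScorer(nn.Module):
    """Scores every (start, end) token pair for each span label (illustrative sketch)."""

    def __init__(self, hidden_size: int, num_labels: int):
        super().__init__()
        # Separate projections for start and end token representations.
        self.start_proj = nn.Linear(hidden_size, hidden_size)
        self.end_proj = nn.Linear(hidden_size, hidden_size)
        # Bilinear term: one (hidden x hidden) matrix per label.
        self.bilinear = nn.Parameter(torch.zeros(num_labels, hidden_size, hidden_size))
        # Linear term over concatenated start/end representations, plus bias.
        self.linear = nn.Linear(2 * hidden_size, num_labels)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden) from the shared BERT-style encoder.
        start = torch.relu(self.start_proj(hidden_states))  # (B, L, H)
        end = torch.relu(self.end_proj(hidden_states))      # (B, L, H)
        # Bilinear score for every (start, end) pair and label: (B, C, L, L)
        bilinear = torch.einsum("bih,chk,bjk->bcij", start, self.bilinear, end)
        # Linear score from concatenated pair representations.
        pair = torch.cat(
            [start.unsqueeze(2).expand(-1, -1, end.size(1), -1),
             end.unsqueeze(1).expand(-1, start.size(1), -1, -1)],
            dim=-1,
        )                                                    # (B, L, L, 2H)
        linear = self.linear(pair).permute(0, 3, 1, 2)       # (B, C, L, L)
        # A span (i, j) with label c is predicted when its score exceeds a threshold.
        return bilinear + linear
```

Under this sketch, classification and extraction tasks share the same decoder: extraction tasks read off high-scoring (start, end) spans, while classification tasks can be cast as scoring a designated span (e.g., over the label text or a prompt position) for each candidate label.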