Language models (LMs) have been instrumental to the rapid advance of natural language processing. This paper studies continual learning of LMs, in particular continual domain-adaptive pre-training (continual DAP-training). Existing research has shown that further pre-training an LM on a domain corpus to adapt it to that domain can improve end-task performance in the domain. This paper proposes a novel method to continually DAP-train an LM with a sequence of unlabeled domain corpora, adapting the LM to these domains to improve their end-task performance. The key novelty of our method is a soft-masking mechanism that directly controls the updates to the LM. A novel proxy is also proposed to preserve the general knowledge in the original LM. Additionally, the method contrasts the representations of the previously learned domain knowledge (including the general knowledge in the pre-trained LM) with the knowledge from the current full network to achieve knowledge integration. The method not only overcomes catastrophic forgetting but also achieves knowledge transfer to improve end-task performance. Empirical evaluation demonstrates the effectiveness of the proposed method.
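To make the soft-masking idea concrete, the following is a minimal PyTorch-style sketch, not the paper's actual implementation: it assumes a hypothetical `importance` dictionary of per-parameter scores in [0, 1] (computed, e.g., from a proxy loss on previously learned knowledge) and scales gradients by one minus those scores so that important units are updated less during DAP-training on a new domain.

```python
import torch


def soft_mask_gradients(model: torch.nn.Module, importance: dict) -> None:
    """Attenuate gradient updates for units deemed important to previously
    learned (general or domain) knowledge. `importance` maps parameter names
    to tensors of scores in [0, 1]; a score of 1 fully protects that unit.
    This is an illustrative sketch of soft-masking, not the authors' code."""
    for name, param in model.named_parameters():
        if param.grad is not None and name in importance:
            # Soft mask: scale the gradient instead of hard-blocking it.
            param.grad.mul_(1.0 - importance[name])


# Sketch of use inside a continual DAP-training loop (assumed names):
#   loss = masked_lm_loss + contrastive_integration_loss
#   loss.backward()
#   soft_mask_gradients(model, importance)  # apply soft masks before the step
#   optimizer.step()
#   optimizer.zero_grad()
```

In this sketch the masking is applied to gradients rather than to parameters, so no units are permanently frozen; how the importance scores and the contrastive integration loss are actually computed follows the method described in the paper.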