Recent developments in transfer learning have boosted advances in natural language processing tasks. Performance, however, depends on high-quality, manually annotated training data. Especially in the biomedical domain, it has been shown that a single training corpus is not sufficient to learn generic models that predict well on new data. Therefore, state-of-the-art models need the ability to learn lifelong, improving performance as soon as new data become available, without retraining the whole model from scratch. We present WEAVER, a simple yet efficient post-processing method that infuses old knowledge into the new model, thereby reducing catastrophic forgetting. We show that applying WEAVER sequentially yields word embedding distributions similar to those obtained by combined training on all data at once, while being computationally more efficient. Because no data sharing is required, the presented method is also easily applicable to federated learning settings and can, for example, benefit the mining of electronic health records from different clinics.
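To make the idea of post-processing embeddings concrete, the following is a minimal, hypothetical sketch in Python. It is not the paper's exact formulation: it simply assumes the old and new embedding tables are combined per word by a weighted average, with weights taken from (assumed) word counts in the old and new corpora, so that no raw data needs to be shared between the two training rounds.

```python
import numpy as np

def weave_embeddings(old_emb, new_emb, old_counts, new_counts):
    """Combine an old and a new word-embedding table without retraining.

    old_emb / new_emb: dicts mapping word -> numpy vector.
    old_counts / new_counts: dicts mapping word -> corpus frequency
    (hypothetical weighting; the paper defines the exact scheme).
    Words seen in only one table keep that table's vector.
    """
    combined = {}
    for word in set(old_emb) | set(new_emb):
        if word not in old_emb:
            combined[word] = new_emb[word]
        elif word not in new_emb:
            combined[word] = old_emb[word]
        else:
            w_old = old_counts.get(word, 0)
            w_new = new_counts.get(word, 0)
            total = w_old + w_new
            if total == 0:
                # No frequency information: fall back to a plain average.
                combined[word] = (old_emb[word] + new_emb[word]) / 2.0
            else:
                combined[word] = (w_old * old_emb[word]
                                  + w_new * new_emb[word]) / total
    return combined
```

Applied after each training round, such a step carries knowledge from earlier corpora forward into the current model, which is the behavior the abstract attributes to WEAVER; the concrete weighting above is only an illustrative assumption.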