Sequential fine-tuning and multi-task learning are methods aiming to incorporate knowledge from multiple tasks; however, they suffer from catastrophic forgetting and difficulties in dataset balancing. To address these shortcomings, we propose AdapterFusion, a new two-stage learning algorithm that leverages knowledge from multiple tasks. First, in the knowledge extraction stage we learn task-specific parameters called adapters, which encapsulate the task-specific information. We then combine the adapters in a separate knowledge composition step. We show that by separating the two stages, i.e., knowledge extraction and knowledge composition, the classifier can effectively exploit the representations learned from multiple tasks in a non-destructive manner. We empirically evaluate AdapterFusion on 16 diverse NLU tasks, and find that it effectively combines various types of knowledge at different layers of the model. We show that our approach outperforms traditional strategies such as full fine-tuning as well as multi-task learning. Our code and adapters are available at AdapterHub.ml.
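To make the two-stage idea concrete, the following is a minimal PyTorch sketch, not the authors' implementation: a hypothetical bottleneck `Adapter` module (stage one, trained per task with the transformer frozen) and an `AdapterFusion`-style layer that attends over the frozen adapters' outputs (stage two). Module names, the bottleneck size, and the query/key/value projections are illustrative assumptions.

```python
# Illustrative sketch of the two AdapterFusion stages (not the reference code).
from typing import List

import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Stage 1: bottleneck adapter -- down-project, non-linearity, up-project, residual."""

    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.ReLU()

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return h + self.up(self.act(self.down(h)))


class AdapterFusion(nn.Module):
    """Stage 2: attention over the outputs of N frozen task adapters.

    The layer's hidden state acts as the query; each adapter output serves as
    key and value. Only these fusion parameters are trained in stage two.
    """

    def __init__(self, hidden_size: int):
        super().__init__()
        self.query = nn.Linear(hidden_size, hidden_size)
        self.key = nn.Linear(hidden_size, hidden_size)
        self.value = nn.Linear(hidden_size, hidden_size)

    def forward(self, h: torch.Tensor, adapter_outputs: List[torch.Tensor]) -> torch.Tensor:
        # adapter_outputs: N tensors of shape (batch, seq, hidden)
        stacked = torch.stack(adapter_outputs, dim=2)           # (batch, seq, N, hidden)
        q = self.query(h).unsqueeze(2)                          # (batch, seq, 1, hidden)
        k = self.key(stacked)                                   # (batch, seq, N, hidden)
        v = self.value(stacked)                                 # (batch, seq, N, hidden)
        scores = (q * k).sum(-1)                                # (batch, seq, N)
        weights = torch.softmax(scores, dim=-1).unsqueeze(-1)   # (batch, seq, N, 1)
        return h + (weights * v).sum(dim=2)                     # residual mix of adapter outputs


# Usage sketch: first train one Adapter per task (transformer and other adapters frozen),
# then freeze all adapters and train only the AdapterFusion weights on the target task.
hidden = 768
adapters = [Adapter(hidden) for _ in range(3)]   # three pre-trained task adapters (stage 1)
fusion = AdapterFusion(hidden)                   # composition layer (stage 2)
h = torch.randn(2, 16, hidden)                   # hidden states from one transformer layer
out = fusion(h, [a(h) for a in adapters])        # (2, 16, 768)
```

In the paper, an instance of this composition is inserted at every transformer layer, which is how different layers come to weight the task adapters differently; the sketch shows a single layer for brevity.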