Intermediate task fine-tuning has been shown to culminate in large transfer gains across many NLP tasks. With an abundance of candidate datasets as well as pre-trained language models, it has become infeasible to run the cross-product of all combinations to find the best transfer setting. In this work we first establish that similar sequential fine-tuning gains can be achieved in adapter settings, and subsequently consolidate previously proposed methods that efficiently identify beneficial tasks for intermediate transfer learning. We experiment with a diverse set of 42 intermediate and 11 target English classification, multiple-choice, question answering, and sequence tagging tasks. Our results show that efficient embedding-based methods, which rely solely on the respective datasets, outperform computationally expensive few-shot fine-tuning approaches. Our best methods achieve an average Regret@3 of less than 1% across all target tasks, demonstrating that we are able to efficiently identify the best datasets for intermediate training.
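To make the reported metric concrete, the following is a minimal sketch of Regret@k, assuming the standard relative-regret definition: the percentage gap between the best achievable transfer result on a target task and the best result among the top-k intermediate tasks proposed by a selection method. The function name, task names, and scores below are hypothetical illustrations, not values from the paper.

```python
from typing import Dict, List


def regret_at_k(true_scores: Dict[str, float], ranking: List[str], k: int = 3) -> float:
    """Relative regret (%) of restricting transfer to the top-k ranked intermediate tasks.

    true_scores: target-task performance after transferring from each intermediate task.
    ranking:     intermediate tasks ordered by the selection method (best first).
    """
    best = max(true_scores.values())
    best_in_top_k = max(true_scores[t] for t in ranking[:k])
    return 100.0 * (best - best_in_top_k) / best


# Hypothetical example: the true best intermediate task ("task_a") is ranked last,
# but the top-ranked task ("task_b") transfers almost as well, so Regret@3 stays small.
scores = {"task_a": 82.0, "task_b": 81.5, "task_c": 70.0, "task_d": 65.0}
print(regret_at_k(scores, ["task_b", "task_c", "task_d", "task_a"], k=3))  # ~0.61
```

A Regret@3 below 1%, averaged over target tasks, thus means that the best-performing intermediate dataset (or one nearly as good) is almost always contained in the method's top three suggestions.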