In this paper, we introduce Target-Aware Weighted Training (TAWT), a weighted training algorithm for cross-task learning based on minimizing a representation-based task distance between the source and target tasks. We show that TAWT is easy to implement, is computationally efficient, requires little hyperparameter tuning, and enjoys non-asymptotic learning-theoretic guarantees. The effectiveness of TAWT is corroborated through extensive experiments with BERT on four sequence tagging tasks in natural language processing (NLP), including part-of-speech (PoS) tagging, chunking, predicate detection, and named entity recognition (NER). As a byproduct, the proposed representation-based task distance allows one to reason in a theoretically principled way about several critical aspects of cross-task learning, such as the choice of the source data and the impact of fine-tuning.
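To make the high-level description concrete, the following is a minimal sketch of a TAWT-style weighted-training loop in PyTorch. It is not the authors' implementation: the toy linear models, the synthetic data, the softmax temperature, and the use of cosine similarity between source and target gradients of a shared encoder as a proxy for the representation-based task distance are all illustrative assumptions.

```python
# A minimal sketch of target-aware weighted training, under simplifying
# assumptions: toy linear models, one shared representation (encoder), and
# the representation-based task distance approximated by cosine similarity
# between source and target encoder gradients. All names are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Shared representation and per-task heads (toy dimensions).
encoder = nn.Linear(16, 8)
heads = {name: nn.Linear(8, 2) for name in ["source_a", "source_b", "target"]}
params = list(encoder.parameters()) + [p for h in heads.values() for p in h.parameters()]
optimizer = torch.optim.SGD(params, lr=0.1)

def task_loss(name, x, y):
    return F.cross_entropy(heads[name](encoder(x)), y)

def encoder_grad(loss):
    # Flattened gradient of a task loss w.r.t. the shared encoder only.
    grads = torch.autograd.grad(loss, encoder.parameters(), retain_graph=True)
    return torch.cat([g.reshape(-1) for g in grads])

# Toy data: two source tasks and one target task.
data = {name: (torch.randn(32, 16), torch.randint(0, 2, (32,))) for name in heads}

for step in range(100):
    losses = {name: task_loss(name, *data[name]) for name in heads}

    # (i) Re-weight the source tasks by gradient alignment with the target:
    # sources whose encoder gradients point the same way as the target's
    # (i.e., are "closer" in representation space) receive larger weights.
    g_target = encoder_grad(losses["target"])
    sims = torch.stack([
        F.cosine_similarity(encoder_grad(losses[s]), g_target, dim=0)
        for s in ["source_a", "source_b"]
    ])
    weights = F.softmax(sims / 0.1, dim=0).detach()  # temperature 0.1 is an assumption

    # (ii) One gradient step on the weighted source losses plus the target loss.
    total = losses["target"] + weights[0] * losses["source_a"] + weights[1] * losses["source_b"]
    optimizer.zero_grad()
    total.backward()
    optimizer.step()
```

The two alternating updates mirror the abstract's description: the weights adapt to a gradient-based estimate of the task distance, and the shared representation is then trained on the resulting weighted objective, so no per-task weight needs to be hand-tuned.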