Recent approaches to multi-task learning (MTL) have focused on modelling connections between tasks at the decoder level. This leads to a tight coupling between tasks, requiring retraining whenever a task is added or removed. We argue that MTL is a stepping stone towards universal feature learning (UFL): the ability to learn generic features that can be applied to new tasks without retraining. We propose Medusa to realize this goal, designing task heads with dual attention mechanisms. The shared feature attention masks relevant backbone features for each task, allowing the backbone to learn a generic representation. Meanwhile, a novel Multi-Scale Attention head allows the network to better combine per-task features from different scales when making the final prediction. We show the effectiveness of Medusa in UFL (+13.18% improvement), while maintaining MTL performance and being 25% more efficient than previous approaches.
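To make the dual-attention idea concrete, the sketch below shows one plausible PyTorch rendering of a task head that masks shared backbone features with a learned per-task attention gate and then fuses several scales with learned attention weights. This is a minimal illustration only: the class names, layer choices, channel sizes, and fusion scheme are assumptions and do not reproduce Medusa's actual architecture, which the abstract does not specify.

```python
# Minimal sketch (not the authors' reference code) of a dual-attention task head:
# (1) a per-task attention mask over shared backbone features, and
# (2) learned attention weights to combine per-task features across scales.
# All module names and dimensions are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureAttention(nn.Module):
    """Per-task channel-wise attention over shared backbone features."""

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),  # soft mask in [0, 1] selecting task-relevant features
        )

    def forward(self, shared_feat: torch.Tensor) -> torch.Tensor:
        return shared_feat * self.gate(shared_feat)


class MultiScaleAttentionHead(nn.Module):
    """Fuses masked per-task features from multiple scales before prediction."""

    def __init__(self, channels: int, num_scales: int, out_channels: int):
        super().__init__()
        self.attend = nn.ModuleList(
            [FeatureAttention(channels) for _ in range(num_scales)]
        )
        # One attention logit per scale, predicted from globally pooled features.
        self.scale_logits = nn.ModuleList(
            [nn.Linear(channels, 1) for _ in range(num_scales)]
        )
        self.predict = nn.Conv2d(channels, out_channels, kernel_size=1)

    def forward(self, multi_scale_feats):
        # multi_scale_feats: list of (B, C, H_i, W_i) shared backbone features.
        target_size = multi_scale_feats[0].shape[-2:]
        masked, logits = [], []
        for feat, attend, score in zip(multi_scale_feats, self.attend, self.scale_logits):
            f = attend(feat)  # task-specific masking of shared features
            masked.append(
                F.interpolate(f, size=target_size, mode="bilinear", align_corners=False)
            )
            logits.append(score(f.mean(dim=(-2, -1))))  # (B, 1) per scale
        weights = torch.softmax(torch.stack(logits, dim=1), dim=1)  # (B, S, 1)
        fused = sum(
            w.unsqueeze(-1).unsqueeze(-1) * f
            for w, f in zip(weights.unbind(1), masked)
        )
        return self.predict(fused)


if __name__ == "__main__":
    feats = [torch.randn(2, 64, 32, 32), torch.randn(2, 64, 16, 16)]
    head = MultiScaleAttentionHead(channels=64, num_scales=2, out_channels=1)
    print(head(feats).shape)  # torch.Size([2, 1, 32, 32])
```

Because the backbone features are only read (and masked) by each head, a new task can in principle attach its own head without retraining the shared representation, which is the UFL setting the abstract describes.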