In this paper, we propose a novel multi-task learning architecture, which incorporates recent advances in attention mechanisms. Our approach, the Multi-Task Attention Network (MTAN), consists of a single shared network containing a global feature pool, together with task-specific soft-attention modules, which are trainable in an end-to-end manner. These attention modules allow learning of task-specific features from the global pool, whilst simultaneously allowing features to be shared across different tasks. The architecture can be built upon any feed-forward neural network, is simple to implement, and is parameter efficient. Experiments on the Cityscapes dataset show that our method outperforms several baselines in both single-task and multi-task learning, and is also more robust to the various weighting schemes in the multi-task loss function. We further explore the effectiveness of our method through experiments over a range of task complexities, and show how our method scales well with task complexity compared to baselines.
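To make the mechanism concrete, the following is a minimal numpy sketch of one possible form of a task-specific soft-attention module: each task learns a gate in (0, 1) over the shared (global) features, so the task branch selects a soft subset of the pool while the pool itself remains shared. The shapes, the sigmoid gating, and the linear mask head are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def soft_attention_module(shared_feat, W, b):
    """Gate shared features for one task.

    shared_feat: (batch, C) features from the shared global pool.
    W, b: parameters of this task's (hypothetical) linear mask head.
    Returns task-specific features of the same shape.
    """
    logits = shared_feat @ W + b
    mask = 1.0 / (1.0 + np.exp(-logits))  # sigmoid: soft mask in (0, 1)
    return mask * shared_feat             # element-wise selection from the pool

C = 8
shared = rng.standard_normal((4, C))  # one batch of shared features

# One independent mask head per task; names are placeholders.
tasks = {t: (0.1 * rng.standard_normal((C, C)), np.zeros(C))
         for t in ["segmentation", "depth"]}
task_feats = {t: soft_attention_module(shared, W, b)
              for t, (W, b) in tasks.items()}
```

Because the mask is strictly between 0 and 1, each task's features are a softly attenuated view of the same shared pool, which is how feature sharing and task specialisation can coexist in one network.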