Multi-Task Learning (MTL) is a widely used and powerful paradigm for training deep neural networks that allows a single backbone to learn multiple objectives. Compared to training tasks separately, MTL significantly reduces computational costs, improves data efficiency, and can enhance model performance by leveraging knowledge across tasks. Hence, it has been adopted in a variety of applications, ranging from computer vision to natural language processing and speech recognition. Within this field, an emerging line of work focuses on manipulating task gradients to derive an aggregated descent direction that benefits all tasks. Despite achieving impressive results on many benchmarks, directly applying these approaches without appropriate regularization may lead to suboptimal solutions on real-world problems. In particular, standard training that minimizes the empirical loss on the training data can easily overfit to low-resource tasks or be spoiled by noisily labeled ones, which can cause negative transfer between tasks and an overall performance drop. To alleviate these problems, we propose to leverage a recently introduced training method, Sharpness-aware Minimization (SAM), which has been shown to improve generalization in single-task learning. Accordingly, we present a novel MTL training methodology that encourages the model to find task-based flat minima, coherently improving its generalization capability across all tasks. Finally, we conduct comprehensive experiments on a variety of applications to demonstrate the merit of our approach over existing gradient-based MTL methods, consistent with our theoretical analysis.
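As a rough illustration of the sharpness-aware idea named in the abstract (not the authors' implementation), the sketch below computes a SAM-style worst-case perturbed gradient separately for each task on a toy two-task setup and averages the resulting gradients for the shared update. The backbone, heads, synthetic data, `rho`, and the plain averaging of task gradients are all illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy multi-task setup (illustrative, not the paper's code):
# a shared backbone with two task-specific heads on synthetic data.
backbone = nn.Sequential(nn.Linear(10, 32), nn.ReLU())
heads = nn.ModuleList([nn.Linear(32, 1), nn.Linear(32, 1)])
params = list(backbone.parameters()) + list(heads.parameters())
opt = torch.optim.SGD(params, lr=1e-2)

x = torch.randn(64, 10)
ys = [torch.randn(64, 1), torch.randn(64, 1)]  # one target per task
rho = 0.05  # radius of the sharpness-aware perturbation (assumed value)

def task_loss(t):
    return nn.functional.mse_loss(heads[t](backbone(x)), ys[t])

for step in range(100):
    opt.zero_grad()
    sam_grads = [torch.zeros_like(p) for p in params]

    for t in range(len(heads)):
        # 1) gradient of task t at the current weights
        grads = torch.autograd.grad(task_loss(t), params, allow_unused=True)
        grads = [torch.zeros_like(p) if g is None else g
                 for g, p in zip(grads, params)]
        norm = torch.sqrt(sum((g ** 2).sum() for g in grads)) + 1e-12

        # 2) ascend to the (approximate) worst-case point in a rho-ball
        eps = [rho * g / norm for g in grads]
        with torch.no_grad():
            for p, e in zip(params, eps):
                p.add_(e)

        # 3) gradient at the perturbed weights: the SAM gradient for task t
        grads_pert = torch.autograd.grad(task_loss(t), params, allow_unused=True)
        with torch.no_grad():
            for p, e in zip(params, eps):
                p.sub_(e)  # restore the original weights
        for acc, g in zip(sam_grads, grads_pert):
            if g is not None:
                acc.add_(g / len(heads))  # simple averaging across tasks

    # 4) descend along the averaged task-based sharpness-aware gradient
    for p, g in zip(params, sam_grads):
        p.grad = g
    opt.step()
```

In practice, gradient-based MTL methods would replace the plain average in step 4 with their own aggregation rule; the sketch only indicates where a per-task flatness-seeking step could fit into the update.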