The goal of multi-task learning is to enable more efficient learning than single-task learning by sharing model structures across a diverse set of tasks. A standard multi-task learning objective is to minimize the average loss across all tasks. While straightforward, using this objective often results in much worse final performance on each task than learning them independently. A major challenge in optimizing a multi-task model is conflicting gradients: the gradients of different task objectives are not well aligned, so following the average gradient direction can be detrimental to specific tasks' performance. Previous work has proposed several heuristics that manipulate the task gradients to mitigate this problem, but most of them lack a convergence guarantee and/or may converge to an arbitrary Pareto-stationary point. In this paper, we introduce Conflict-Averse Gradient descent (CAGrad), which minimizes the average loss function while leveraging the worst local improvement among individual tasks to regularize the algorithm trajectory. CAGrad balances the objectives automatically and still provably converges to a minimum of the average loss. It includes regular gradient descent (GD) and the multiple gradient descent algorithm (MGDA) from the multi-objective optimization (MOO) literature as special cases. On a series of challenging multi-task supervised learning and reinforcement learning tasks, CAGrad achieves improved performance over prior state-of-the-art multi-objective gradient manipulation methods.
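To make the update described above concrete, the following is a minimal NumPy/SciPy sketch of one way the per-step CAGrad direction can be computed: given per-task gradients, it solves a simplex-constrained dual problem for task weights and then corrects the average gradient within a trust region of radius c times the average-gradient norm. This is an illustrative sketch, not the authors' released implementation; the helper name cagrad_direction, the use of SciPy's SLSQP solver, and the specific numerical safeguards are assumptions made here for clarity.

```python
import numpy as np
from scipy.optimize import minimize

def cagrad_direction(G, c=0.5):
    """Sketch of a CAGrad-style update direction from per-task gradients.

    G : array of shape (K, d), one flattened gradient per task.
    c : trade-off constant; c = 0 recovers plain gradient descent on the
        average loss, while large c approaches MGDA-like behavior.
    """
    K = G.shape[0]
    g0 = G.mean(axis=0)                      # average gradient across tasks
    GG = G @ G.T                             # Gram matrix of task gradients
    g0_norm = np.linalg.norm(g0)
    phi = (c * g0_norm) ** 2                 # squared trust-region radius

    # Dual objective over the probability simplex:
    #   F(w) = g_w^T g_0 + sqrt(phi) * ||g_w||,  with g_w = sum_i w_i g_i.
    def dual(w):
        gw_dot_g0 = w @ GG.mean(axis=0)      # equals g_w^T g_0 since g_0 is the row mean
        gw_norm = np.sqrt(max(w @ GG @ w, 1e-12))
        return gw_dot_g0 + np.sqrt(phi) * gw_norm

    w0 = np.ones(K) / K                      # start from uniform task weights
    res = minimize(dual, w0,
                   bounds=[(0.0, 1.0)] * K,
                   constraints={'type': 'eq', 'fun': lambda w: w.sum() - 1.0})
    w = res.x
    gw = w @ G
    gw_norm = np.linalg.norm(gw) + 1e-12

    # Primal direction: the average gradient, corrected toward the
    # worst-case (conflict-averse) weighted gradient within the trust region.
    return g0 + (np.sqrt(phi) / gw_norm) * gw
```

In a training loop, one would update the shared parameters with this direction in place of the plain average gradient, e.g. theta -= lr * cagrad_direction(G); with c = 0 the sketch reduces to standard gradient descent on the average loss, matching the special cases noted in the abstract.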