When different objectives conflict in multi-task learning, their gradients interfere, slowing convergence and potentially degrading the final model's performance. To address this, we introduce SON-GOKU, a scheduler that estimates pairwise gradient interference, constructs an interference graph, and applies greedy graph coloring to partition tasks into mutually compatible groups. At each training step, only one group (color class) of tasks is activated, and the partition is repeatedly recomputed as task relationships evolve throughout training. By ensuring that each mini-batch contains only tasks that pull the model in the same direction, our method improves the effectiveness of any underlying multi-task learning optimizer without additional tuning. Because tasks within a group update the shared parameters in compatible directions, multi-task learning helps rather than hinders model performance. Empirical results on six datasets show that this interference-aware graph-coloring approach consistently outperforms baselines and state-of-the-art multi-task optimizers. We also provide extensive theory explaining why grouping and sequential updates improve multi-task learning, with guarantees on descent, convergence, and accurate identification of conflicting and aligned tasks.
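To make the scheduling loop concrete, the following is a minimal sketch of the interference-graph construction and greedy coloring described above. It is illustrative only: the interference measure (cosine similarity of flattened task gradients), the conflict threshold, the degree-descending coloring order, and all function names are assumptions, not the paper's exact specification.

```python
# Minimal sketch of an interference-aware grouping scheduler in the spirit
# of SON-GOKU. The interference measure, threshold, recompute schedule, and
# coloring order below are illustrative assumptions.
import torch

def interference_graph(grads, threshold=0.0):
    """Build a conflict graph: edge (i, j) iff the two task gradients point
    in opposing directions (cosine similarity below `threshold`)."""
    flat = [g / (g.norm() + 1e-12) for g in grads]
    n = len(flat)
    edges = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if torch.dot(flat[i], flat[j]).item() < threshold:
                edges[i].add(j)
                edges[j].add(i)
    return edges

def greedy_coloring(edges):
    """Assign each task the smallest color unused by its conflict neighbors;
    each resulting color class is a group of mutually compatible tasks."""
    colors = {}
    for task in sorted(edges, key=lambda t: len(edges[t]), reverse=True):
        taken = {colors[nb] for nb in edges[task] if nb in colors}
        color = 0
        while color in taken:
            color += 1
        colors[task] = color
    groups = {}
    for task, color in colors.items():
        groups.setdefault(color, []).append(task)
    return [groups[c] for c in sorted(groups)]

# Usage: recompute the partition periodically from fresh per-task gradients,
# then cycle through the color classes, activating one group per step.
grads = [torch.randn(10) for _ in range(4)]  # stand-ins for task gradients
groups = greedy_coloring(interference_graph(grads))
for step in range(8):
    active = groups[step % len(groups)]      # only this group updates now
    print(f"step {step}: activate tasks {active}")
```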