Multi-task learning has the potential to improve generalization by maximizing positive transfer between tasks while reducing task interference. Fully realizing this potential is hindered by manually designed architectures that remain static throughout training. In contrast, learning in the brain occurs through structural changes that happen in tandem with changes in synaptic strength. We therefore propose \textit{Multi-Task Structural Learning (MTSL)}, which simultaneously learns the multi-task architecture and its parameters. MTSL begins with an identical single-task network for each task and alternates between a task-learning phase and a structural-learning phase. In the task-learning phase, each network specializes in its corresponding task. In each structural-learning phase, starting from the earliest layer, locally similar task layers first transfer their knowledge to a newly created group layer and are then removed. MTSL then uses the group layer in place of the removed task layers and moves on to the next layers. Our empirical results show that MTSL achieves generalization competitive with various baselines and improves robustness to out-of-distribution data.
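To make the alternation concrete, below is a minimal PyTorch sketch of one structural-learning pass. It is written under illustrative assumptions: the probe-batch cosine similarity, the activation-regression transfer, the merge threshold, and all names (\texttt{make\_net}, \texttt{distill\_group\_layer}, etc.) are ours for exposition and are not the paper's exact procedure.

\begin{verbatim}
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Two tasks, each starting from an identical stack of layers.
def make_net():
    return nn.ModuleList([nn.Sequential(nn.Linear(16, 32), nn.ReLU()),
                          nn.Sequential(nn.Linear(32, 32), nn.ReLU())])

task_nets = [make_net() for _ in range(2)]
shared = []                       # group layers accumulated so far
probe = torch.randn(64, 16)       # probe batch for measuring local similarity

def forward_prefix(x):
    for layer in shared:          # run the already-merged group-layer prefix
        x = layer(x)
    return x

def similarity(layers, x):
    # Mean cosine similarity between the two task layers' activations
    # on the probe batch (an assumed stand-in for "locally similar").
    f0, f1 = layers[0](x).flatten(1), layers[1](x).flatten(1)
    return F.cosine_similarity(f0, f1, dim=1).mean().item()

def distill_group_layer(layers, x, steps=200, lr=1e-2):
    # The task layers transfer knowledge to a freshly created group layer
    # (here: regression onto their averaged activations) before removal.
    group = copy.deepcopy(layers[0])
    target = torch.stack([l(x) for l in layers]).mean(0).detach()
    opt = torch.optim.Adam(group.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.mse_loss(group(x), target).backward()
        opt.step()
    return group

# One structural-learning phase: walk layers from the earliest,
# merging locally similar task layers into group layers.
for i in range(len(task_nets[0])):
    with torch.no_grad():
        x = forward_prefix(probe)
        sim = similarity([net[i] for net in task_nets], x)
    if sim < 0.5:                 # illustrative merge threshold
        break                     # remaining layers stay task-specific
    shared.append(distill_group_layer([net[i] for net in task_nets], x))

print(f"merged {len(shared)} layer(s) into group layers")
\end{verbatim}

In the full method, such a pass alternates with task-learning phases in which each network trains on its own task loss, so merging proceeds layer by layer from the earliest layers upward.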