Multitask learning assumes that models capable of learning from multiple tasks can achieve better quality and efficiency via knowledge transfer, a key feature of human learning. However, state-of-the-art ML models rely on high customization for each task and leverage model size and data scale rather than scaling the number of tasks. Furthermore, continual learning, which adds the temporal dimension to multitask learning, is often focused on the study of common pitfalls such as catastrophic forgetting rather than being studied at large scale as a critical component for building the next generation of artificial intelligence. We propose an evolutionary method capable of generating large-scale multitask models that support the dynamic addition of new tasks. The generated multitask models are sparsely activated and integrate a task-based routing that guarantees bounded compute cost and fewer added parameters per task as the model expands. The proposed method relies on a knowledge compartmentalization technique to achieve immunity against catastrophic forgetting and other common pitfalls such as gradient interference and negative transfer. We demonstrate empirically that the proposed method can jointly solve and achieve competitive results on 69 public image classification tasks, for example improving the state of the art on a competitive benchmark such as CIFAR-10 by achieving a 15% relative error reduction compared to the best model trained on public data.