Leveraging data from multiple tasks, either all at once or incrementally, to learn one model is an idea that lies at the heart of multi-task and continual learning methods. Ideally, such a model predicts each task more accurately than if the task were trained in isolation. Using tools from statistical learning theory, we show (i) how tasks can compete for capacity, i.e., how including a particular task in the training set can degrade accuracy on another task, and (ii) that the ideal set of tasks to train together in order to perform well on a given task differs from task to task. We develop methods to discover such competition in typical benchmark datasets, which suggests that the prevalent practice of training with all tasks leaves performance on the table. This motivates our "Model Zoo", a boosting-based algorithm that builds an ensemble of models, each of which is very small and trained on a small subset of tasks. Model Zoo achieves large gains in prediction accuracy compared to state-of-the-art methods across a variety of existing benchmarks in multi-task and continual learning, as well as on more challenging ones of our own creation. We also show that even models trained independently on each task outperform all existing multi-task and continual learning methods.
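To make the high-level description concrete, here is a minimal sketch of a boosting-style "zoo" of small models, each trained on a subset of tasks. This is not the paper's implementation: it assumes synthetic linear regression tasks, uses ridge regression as a stand-in for the small multi-head networks, and replaces the paper's task-selection scheme with a crude "pick the worst-served tasks" rule; names such as `model_zoo_sketch` and `make_tasks` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_tasks(n_tasks=6, n=80, d=10):
    """Synthetic regression tasks: the first three share nearly the same ground-truth
    weights (synergistic), the rest are far apart (they compete when pooled)."""
    base = rng.normal(size=d)
    tasks = []
    for t in range(n_tasks):
        w = base + (0.1 if t < 3 else 3.0) * rng.normal(size=d)
        X = rng.normal(size=(n, d))
        y = X @ w + 0.1 * rng.normal(size=n)
        tasks.append((X, y))
    return tasks


def fit_small_model(X, y, lam=1e-1):
    """Closed-form ridge regression stands in for a small multi-head network."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def model_zoo_sketch(tasks, rounds=4, subset_size=3):
    """Boosting-style loop: each round trains one small model on the subset of tasks
    that the current ensemble serves worst, then adds it to the zoo."""
    zoo = []  # list of (task_subset, weight_vector)
    for _ in range(rounds):
        losses = []
        for t, (X, y) in enumerate(tasks):
            preds = [X @ w for subset, w in zoo if t in subset]
            yhat = np.mean(preds, axis=0) if preds else np.zeros_like(y)
            losses.append(float(np.mean((yhat - y) ** 2)))
        # Crude stand-in for the boosting-style task selection described in the paper:
        # pick the tasks with the largest current ensemble loss.
        subset = tuple(int(i) for i in np.argsort(losses)[-subset_size:])
        X_pool = np.vstack([tasks[t][0] for t in subset])
        y_pool = np.concatenate([tasks[t][1] for t in subset])
        zoo.append((subset, fit_small_model(X_pool, y_pool)))
    return zoo


def predict(zoo, task_id, X):
    """Average the predictions of every zoo member that was trained on this task."""
    preds = [X @ w for subset, w in zoo if task_id in subset]
    if not preds:  # task never selected: fall back to averaging all members
        preds = [X @ w for _, w in zoo]
    return np.mean(preds, axis=0)


if __name__ == "__main__":
    tasks = make_tasks()
    zoo = model_zoo_sketch(tasks)
    for t, (X, y) in enumerate(tasks):
        mse = float(np.mean((predict(zoo, t, X) - y) ** 2))
        print(f"task {t}: zoo MSE = {mse:.3f}")
```

In this sketch, a task's prediction averages only the zoo members that were trained on it, mirroring the abstract's point that each small model is responsible for a small subset of tasks rather than all of them.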