Convolutional Neural Networks experience catastrophic forgetting when optimized on a sequence of learning problems: as they meet the objective of the current training examples, their performance on previous tasks drops drastically. In this work, we introduce a novel framework to tackle this problem with conditional computation. We equip each convolutional layer with task-specific gating modules, selecting which filters to apply on the given input. This way, we achieve two appealing properties. Firstly, the execution patterns of the gates allow us to identify and protect important filters, ensuring no loss in the performance of the model for previously learned tasks. Secondly, by using a sparsity objective, we can promote the selection of a limited set of kernels, allowing us to retain sufficient model capacity to digest new tasks. Existing solutions require, at test time, awareness of the task to which each example belongs. This knowledge, however, may not be available in many practical scenarios. Therefore, we additionally introduce a task classifier that predicts the task label of each example, to deal with settings in which a task oracle is not available. We validate our proposal on four continual learning datasets. Results show that our model consistently outperforms existing methods both in the presence and the absence of a task oracle. Notably, on the Split SVHN and Imagenet-50 datasets, our model yields up to 23.98% and 17.42% improvement in accuracy w.r.t. competing methods.
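To make the gating idea concrete, the following is a minimal PyTorch sketch (not the authors' code) of a convolutional layer equipped with task-specific gating heads that select which output filters to apply, plus a simple sparsity term on the gate activations. The names `TaskGatedConv`, `gates`, and `sparsity_loss`, as well as the soft sigmoid relaxation and the specific sparsity penalty, are illustrative assumptions rather than details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TaskGatedConv(nn.Module):
    """Convolutional layer with one lightweight gating head per task."""
    def __init__(self, in_ch, out_ch, num_tasks, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, padding=kernel_size // 2)
        # One gating head per task: maps pooled input features to one logit
        # per output filter of the convolution.
        self.gates = nn.ModuleList(
            [nn.Linear(in_ch, out_ch) for _ in range(num_tasks)]
        )

    def forward(self, x, task_id):
        # Data-dependent gate logits from globally pooled input features.
        pooled = F.adaptive_avg_pool2d(x, 1).flatten(1)   # (B, in_ch)
        logits = self.gates[task_id](pooled)              # (B, out_ch)
        # Soft relaxation of the binary on/off decision; a hard
        # (straight-through) estimator could be used here instead.
        g = torch.sigmoid(logits)
        # Gate each output filter of the convolution.
        y = self.conv(x) * g.unsqueeze(-1).unsqueeze(-1)
        # Sparsity term encouraging only a few filters to stay active.
        sparsity_loss = g.mean()
        return y, sparsity_loss

# Usage sketch: the sparsity term is added to the task loss with some weight.
layer = TaskGatedConv(in_ch=16, out_ch=32, num_tasks=4)
x = torch.randn(8, 16, 28, 28)
out, s_loss = layer(x, task_id=0)
total_loss = out.pow(2).mean() + 0.05 * s_loss  # placeholder task loss
```

At test time without a task oracle, the `task_id` fed to such a layer would come from the predicted label of the task classifier rather than from ground truth.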