The ability to learn different tasks sequentially is essential to the development of artificial intelligence. In general, neural networks lack this capability, the major obstacle being catastrophic forgetting. Catastrophic forgetting occurs when information from non-stationary data distributions is acquired incrementally, disrupting what the model has already learned. Our approach remembers old tasks by projecting the representations of new tasks close to those of old tasks while keeping the decision boundaries unchanged. We employ the center loss as a regularization penalty that enforces the features of new tasks to share class centers with those of old tasks and makes the features highly discriminative. This, in turn, minimizes forgetting of previously learned information. The method is easy to implement, requires minimal computational and memory overhead, and allows the neural network to maintain high performance across many sequentially encountered tasks. We also demonstrate that using the center loss in conjunction with memory replay outperforms other replay-based strategies. Along with standard MNIST variants for continual learning, we apply our method to continual domain adaptation scenarios with the Digits and PACS datasets. We demonstrate that our approach is scalable, effective, and achieves competitive performance compared to state-of-the-art continual learning methods.
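To make the regularization idea concrete, the following is a minimal sketch (not the authors' code) of combining a cross-entropy objective with a center-loss penalty when training on a new task: class centers estimated on old tasks are kept frozen, and features of new-task samples are pulled toward the stored center of their class. The names `CenterLoss`, `training_step`, and `lambda_c`, and the `backbone`/`classifier` split, are illustrative assumptions, not part of the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CenterLoss(nn.Module):
    """Center loss: mean squared distance between features and their class centers."""

    def __init__(self, num_classes: int, feat_dim: int):
        super().__init__()
        # Centers are stored as a buffer: after the first task they act as
        # fixed anchors for the representations of subsequent tasks.
        self.register_buffer("centers", torch.zeros(num_classes, feat_dim))

    def forward(self, features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # features: (batch, feat_dim), labels: (batch,)
        return 0.5 * (features - self.centers[labels]).pow(2).sum(dim=1).mean()


def training_step(model, center_loss, batch, lambda_c=0.01):
    """One new-task step: classification loss plus center-loss regularization.

    lambda_c (assumed hyperparameter) trades off plasticity on the new task
    against keeping features close to the old tasks' class centers.
    """
    x, y = batch
    features = model.backbone(x)          # penultimate-layer representation
    logits = model.classifier(features)   # decision boundaries largely reused
    return F.cross_entropy(logits, y) + lambda_c * center_loss(features, y)
```

In this sketch, freezing the centers is what ties new-task representations to the regions of feature space the old decision boundaries were trained on; the exact center-update and replay details would follow the paper's full description.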