When learning new tasks sequentially, deep neural networks tend to forget previously learned tasks, a phenomenon called catastrophic forgetting. Class-incremental learning methods aim to address this problem by keeping a memory of a few exemplars from previously learned tasks and distilling knowledge from them. However, existing methods struggle to balance performance across classes, since they typically overfit the model to the latest task. In our work, we propose to address these challenges with a novel methodology, Tangent Kernel for Incremental Learning (TKIL), that achieves class-balanced performance. The approach preserves the representations across classes and balances the accuracy for each class, and as such achieves better overall accuracy with lower variance. TKIL is based on the Neural Tangent Kernel (NTK), which describes the convergence behavior of neural networks as a kernel function in the limit of infinite width. In TKIL, the gradients between feature layers are treated as the distance between the representations of these layers and are captured by a Gradients Tangent Kernel loss (GTK loss), which is minimized jointly with weight averaging. This allows TKIL to automatically identify the task and to quickly adapt to it during inference. Experiments on the CIFAR-100 and ImageNet datasets with various incremental learning settings show that these strategies allow TKIL to outperform existing state-of-the-art methods.
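To make the GTK idea concrete, the sketch below illustrates one way a tangent-kernel-based distillation loss could be written, assuming PyTorch. It is not the authors' implementation: the names `FeatureNet`, `feature_jacobian`, `gtk_loss`, and `average_weights` are hypothetical, and the sketch only reflects the abstract's description (tangent kernels built from feature-layer gradients, compared between a current and a previous model, with a separate weight-averaging step).

```python
# Minimal, illustrative sketch of a gradient-tangent-kernel style loss.
# All names below are hypothetical and not taken from the TKIL paper's code.
import torch
import torch.nn as nn


class FeatureNet(nn.Module):
    """Small stand-in for the shared feature backbone."""
    def __init__(self, in_dim=32, feat_dim=16):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                  nn.Linear(64, feat_dim))

    def forward(self, x):
        return self.body(x)


def feature_jacobian(model, x):
    """Gradients of each feature dimension w.r.t. the model parameters.

    Returns a (feat_dim, n_params) matrix whose rows define an empirical
    tangent kernel K = J J^T.
    """
    feats = model(x).mean(dim=0)  # batch-averaged feature vector
    rows = []
    for i in range(feats.shape[0]):
        grads = torch.autograd.grad(feats[i], model.parameters(),
                                    retain_graph=True, create_graph=True)
        rows.append(torch.cat([g.reshape(-1) for g in grads]))
    return torch.stack(rows)


def gtk_loss(new_model, old_model, x):
    """Frobenius distance between the tangent kernels of two models."""
    J_new = feature_jacobian(new_model, x)
    J_old = feature_jacobian(old_model, x).detach()  # old model is frozen
    K_new = J_new @ J_new.t()
    K_old = J_old @ J_old.t()
    return ((K_new - K_old) ** 2).mean()


def average_weights(models):
    """Average parameters across task models (the weight-averaging step
    mentioned in the abstract, simplified to a plain parameter mean)."""
    avg = {k: torch.zeros_like(v) for k, v in models[0].state_dict().items()}
    for m in models:
        for k, v in m.state_dict().items():
            avg[k] += v / len(models)
    return avg
```

In a training loop, such a term would typically be added to the usual classification loss, e.g. `loss = ce_loss + lam * gtk_loss(new_model, old_model, x)`, with `lam` a hypothetical weighting coefficient; the design choice of comparing kernels rather than raw features is what ties the regularizer to the NTK view described above.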