In continual learning, acquiring new knowledge while retaining previous knowledge is a significant challenge. Existing methods often rely on experience replay, which stores a small portion of previous-task data for training. Within experience replay, data augmentation has emerged as a promising strategy to further improve model performance by mixing the limited previous-task data with the abundant current-task data. However, we show both theoretically and empirically that training on samples mixed from random sample pairs can harm the knowledge of previous tasks and aggravate catastrophic forgetting. We therefore propose GradMix, a robust data augmentation method designed specifically to mitigate catastrophic forgetting in class-incremental learning. GradMix performs gradient-based selective mixup using a class-based criterion: it mixes samples only from class pairs that help reduce catastrophic forgetting, never from detrimental class pairs. Experiments on various real datasets show that GradMix outperforms data augmentation baselines in accuracy by minimizing the forgetting of previous knowledge.
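To make the idea of gradient-based selective mixup concrete, the following is a minimal illustrative sketch, not the authors' implementation: the abstract does not specify the exact class-pair criterion, so this sketch assumes a simple per-pair test based on gradient alignment (a mixed sample is kept only if its gradient has positive dot product with the gradient of the replayed sample, i.e. the mix is not expected to increase the old-task loss). The helper names `grad_vector`, `pair_is_helpful`, and `selective_mixup_batch` are hypothetical.

```python
# Illustrative sketch of gradient-based selective mixup for experience replay.
# Assumptions: a replay buffer batch (previous tasks) and a current-task batch;
# the class-pair criterion here is a per-pair gradient-alignment test, which
# stands in for GradMix's class-based criterion described in the paper.

import torch
import torch.nn.functional as F


def grad_vector(model, x, y):
    """Flattened gradient of the cross-entropy loss w.r.t. model parameters."""
    loss = F.cross_entropy(model(x), y)
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])


def pair_is_helpful(model, x_old, y_old, x_new, lam=0.5):
    """Keep the mixed sample only if its gradient aligns (positive dot product)
    with the gradient of the replayed old-task sample alone."""
    x_mix = lam * x_old + (1.0 - lam) * x_new
    g_mix = grad_vector(model, x_mix, y_old)   # mixed input, old label
    g_old = grad_vector(model, x_old, y_old)   # replayed sample alone
    return torch.dot(g_mix, g_old) > 0


def selective_mixup_batch(model, buffer_batch, current_batch, lam=0.5):
    """Build a training batch that mixes replayed and current samples
    only for pairs judged helpful by the gradient criterion."""
    (xb, yb), (xc, yc) = buffer_batch, current_batch
    xs, ys_a, ys_b, lams = [], [], [], []
    for i in range(min(len(xb), len(xc))):
        if pair_is_helpful(model, xb[i:i + 1], yb[i:i + 1], xc[i:i + 1], lam):
            xs.append(lam * xb[i] + (1.0 - lam) * xc[i])
            ys_a.append(yb[i]); ys_b.append(yc[i]); lams.append(lam)
        else:
            # Detrimental pair: fall back to the replayed sample unmixed.
            xs.append(xb[i]); ys_a.append(yb[i]); ys_b.append(yb[i]); lams.append(1.0)
    return torch.stack(xs), torch.stack(ys_a), torch.stack(ys_b), torch.tensor(lams)
```

As in standard mixup, the returned batch would be trained with the interpolated objective `lam * CE(model(x), y_a) + (1 - lam) * CE(model(x), y_b)`; the per-pair gradient test above is deliberately simple and would in practice be aggregated at the class-pair level rather than recomputed for every sample pair.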