Rehearsal approaches enjoy immense popularity with Continual Learning (CL) practitioners. These methods collect samples from previously encountered data distributions in a small memory buffer; subsequently, they repeatedly optimize on the latter to prevent catastrophic forgetting. This work draws attention to a hidden pitfall of this widespread practice: repeated optimization on a small pool of data inevitably leads to tight and unstable decision boundaries, which are a major hindrance to generalization. To address this issue, we propose Lipschitz-DrivEn Rehearsal (LiDER), a surrogate objective that induces smoothness in the backbone network by constraining its layer-wise Lipschitz constants w.r.t. replay examples. By means of extensive experiments, we show that applying LiDER delivers a stable performance gain to several state-of-the-art rehearsal CL methods across multiple datasets, both in the presence and absence of pre-training. Through additional ablative experiments, we highlight peculiar aspects of buffer overfitting in CL and better characterize the effect produced by LiDER. Code is available at https://github.com/aimagelab/LiDER
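
To make the core idea concrete, below is a minimal PyTorch sketch of one possible Lipschitz-driven surrogate on replay examples: each layer's Lipschitz constant is estimated empirically as the largest ratio of pairwise feature distances after versus before the layer, computed on a batch of buffer samples, and that estimate is added to the loss as a penalty. The helper names (`forward_with_features`, `buffer.sample`, `lider_weight`) are illustrative assumptions, and this empirical estimate is a simplification rather than the authors' exact formulation; the reference implementation is in the repository linked above.

```python
# Hedged sketch of a layer-wise Lipschitz surrogate on replay examples.
# Assumes the backbone can return intermediate feature maps for a batch.
import torch


def pairwise_dist(x):
    # Flatten per-sample features and compute pairwise Euclidean distances;
    # the small epsilon avoids division by zero.
    x = x.flatten(start_dim=1)
    return torch.cdist(x, x) + 1e-8


def lipschitz_surrogate(feature_maps):
    """Empirical layer-wise Lipschitz penalty.

    feature_maps: list [f_0, f_1, ..., f_L] of activations of a replay batch at
    successive layers (f_0 is the input batch). For each consecutive pair, the
    max ratio of output to input pairwise distances lower-bounds the layer's
    Lipschitz constant; the penalty sums these estimates across layers.
    """
    n = feature_maps[0].shape[0]
    off_diag = ~torch.eye(n, dtype=torch.bool, device=feature_maps[0].device)
    penalty = 0.0
    for f_in, f_out in zip(feature_maps[:-1], feature_maps[1:]):
        ratios = pairwise_dist(f_out) / pairwise_dist(f_in)
        penalty = penalty + ratios[off_diag].max()
    return penalty


# Usage inside a rehearsal training step (hypothetical names):
#   buf_x, buf_y = buffer.sample(batch_size)
#   feats = backbone.forward_with_features(buf_x)   # assumed to return [f_0, ..., f_L]
#   loss = ce_loss + replay_loss + lider_weight * lipschitz_surrogate(feats)
```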