Learning from non-stationary data streams and overcoming catastrophic forgetting still poses a serious challenge for machine learning research. Rather than aiming to improve the state of the art, in this work we provide insight into the limits and merits of rehearsal, one of continual learning's most established methods. We hypothesize that models trained sequentially with rehearsal tend to stay in the same low-loss region after a task has finished, but are at risk of overfitting on its sample memory, hence harming generalization. We provide both conceptual and strong empirical evidence on three benchmarks for both behaviors, bringing novel insights into the dynamics of rehearsal and continual learning in general. Finally, we interpret important continual learning works in the light of our findings, allowing for a deeper understanding of their successes.
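For readers unfamiliar with rehearsal, the sketch below illustrates the general idea as it is commonly implemented: a bounded memory of past samples is replayed alongside each new-task batch, so gradient updates balance the current task against earlier ones. This is a minimal, assumed setup (the `ReservoirBuffer` class, buffer capacity, replay batch size, and optimizer choice are illustrative), not the exact configuration studied in the paper.

```python
import random
import torch
import torch.nn as nn


class ReservoirBuffer:
    """Bounded sample memory filled with reservoir sampling (illustrative)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []   # list of (x, y) pairs
        self.seen = 0    # number of examples observed so far

    def add(self, x, y):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append((x, y))
        else:
            idx = random.randrange(self.seen)
            if idx < self.capacity:
                self.data[idx] = (x, y)

    def sample(self, batch_size):
        batch = random.sample(self.data, min(batch_size, len(self.data)))
        xs, ys = zip(*batch)
        return torch.stack(xs), torch.stack(ys)


def train_with_rehearsal(model, tasks, buffer, epochs=1, lr=0.01, replay_bs=32):
    """Train sequentially on `tasks`, replaying stored samples at every step."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for task_loader in tasks:                  # tasks arrive one after another
        for _ in range(epochs):
            for x, y in task_loader:
                loss = loss_fn(model(x), y)    # loss on the current task
                if buffer.data:                # add a replayed loss term
                    mx, my = buffer.sample(replay_bs)
                    loss = loss + loss_fn(model(mx), my)
                opt.zero_grad()
                loss.backward()
                opt.step()
        for x, y in task_loader:               # store samples once the task ends
            for xi, yi in zip(x, y):
                buffer.add(xi, yi)
```

Because the replayed loss is computed on the same small memory over and over across later tasks, the model can fit that memory very well while generalizing poorly to the full distribution of earlier tasks, which is the overfitting risk the abstract refers to.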