Neural networks tend to forget previously learned knowledge when trained continually on datasets with varying distributions, a phenomenon known as catastrophic forgetting. Larger distribution shifts among datasets lead to more forgetting. Recently, parameter-isolation-based approaches have shown great potential in overcoming forgetting under significant distribution shifts. However, they suffer from poor generalization because they fix the neural path for each dataset during training and require dataset labels during inference. In addition, they do not support backward knowledge transfer, as they prioritize past data over future data. In this paper, we propose a new adaptive learning method, named AdaptCL, that fully reuses and grows on learned parameters to overcome catastrophic forgetting and enables positive backward transfer without requiring dataset labels. The proposed technique adaptively grows on the same neural path by allowing optimal reuse of frozen parameters. In addition, it uses parameter-level data-driven pruning to assign equal priority to all data. We conduct extensive experiments on the MNIST Variants, DomainNet, and Food Freshness Detection datasets under different intensities of distribution shift, without requiring dataset labels. Results demonstrate that the proposed method outperforms alternative baselines in minimizing forgetting and enabling positive backward knowledge transfer.
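To make the high-level mechanism concrete, below is a minimal, hypothetical PyTorch sketch of the two ingredients the abstract names: reusing frozen parameters while only updating free capacity, and parameter-level pruning that locks in important weights before the next dataset. This is an illustration under our own assumptions, not the authors' implementation; the names `MaskedLinear`, `zero_frozen_grads`, `prune_and_freeze`, and `keep_ratio` are invented for this example, and the pruning criterion here is simple magnitude-based selection.

```python
import torch
import torch.nn as nn

# Hypothetical illustration (not the authors' released code): a linear layer
# whose weights carry a per-parameter "frozen" mask. Frozen entries are reused
# in the forward pass but their gradients are zeroed, so new data can only
# update the remaining free capacity. A magnitude-based pruning step then
# freezes the surviving important weights before the next dataset arrives.

class MaskedLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))
        # 1 = frozen (learned on earlier data), 0 = free for the current data.
        self.register_buffer("frozen", torch.zeros_like(self.weight))

    def forward(self, x):
        return x @ self.weight.t() + self.bias

    def zero_frozen_grads(self):
        # Call after loss.backward(): leave frozen parameters untouched.
        if self.weight.grad is not None:
            self.weight.grad.mul_(1.0 - self.frozen)

    def prune_and_freeze(self, keep_ratio=0.5):
        # Parameter-level pruning (magnitude-based here for illustration):
        # the largest currently-free weights are frozen for reuse, the rest
        # are reset, leaving capacity for future datasets.
        free = self.frozen == 0
        if free.any():
            vals = self.weight.detach().abs()[free]
            k = max(1, int(keep_ratio * vals.numel()))
            threshold = vals.topk(k).values.min()
            keep = free & (self.weight.detach().abs() >= threshold)
            self.frozen[keep] = 1.0
            with torch.no_grad():
                self.weight[free & ~keep] = 0.0

# Usage sketch: train on one dataset while zeroing frozen gradients each step,
# then prune and freeze before moving on to the next dataset.
layer = MaskedLinear(4, 2)
opt = torch.optim.SGD(layer.parameters(), lr=0.1)
x, y = torch.randn(8, 4), torch.randn(8, 2)
for _ in range(10):
    opt.zero_grad()
    loss = ((layer(x) - y) ** 2).mean()
    loss.backward()
    layer.zero_frozen_grads()
    opt.step()
layer.prune_and_freeze(keep_ratio=0.5)
```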