Recent self-supervised learning methods are able to learn high-quality image representations and are closing the gap with supervised approaches. However, these methods are unable to acquire new knowledge incrementally -- they are, in fact, mostly used only as a pre-training phase over IID data. In this work we investigate self-supervised methods in continual learning regimes without any replay mechanism. We show that naive functional regularization, also known as feature distillation, leads to lower plasticity and limits continual learning performance. Instead, we propose Projected Functional Regularization, in which a separate temporal projection network ensures that the newly learned feature space preserves the information of the previous one, while at the same time allowing for the learning of new features. This prevents forgetting while maintaining the plasticity of the learner. Comparison with other incremental learning approaches applied to self-supervision demonstrates that our method obtains competitive performance in different scenarios and on multiple datasets.
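To make the idea concrete, below is a minimal sketch of how such a projected regularizer could be implemented. It assumes the regularizer is a negative cosine similarity between the current features, passed through a small trainable temporal projector, and the detached features of the frozen previous-task encoder; the class and function names (`TemporalProjector`, `projected_functional_regularization`) and the MLP design are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TemporalProjector(nn.Module):
    """Small trainable MLP that maps current features toward the previous
    feature space (illustrative architecture, not the paper's exact one)."""

    def __init__(self, dim: int, hidden_dim: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, dim),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z)


def projected_functional_regularization(
    z_new: torch.Tensor,        # features from the current, trainable encoder
    z_old: torch.Tensor,        # features from the frozen previous-task encoder
    projector: TemporalProjector,
) -> torch.Tensor:
    """Negative cosine similarity between projected current features and the
    detached previous features. Because the alignment is enforced only after
    the projection, the current encoder is free to learn new feature
    directions while the projector absorbs the mapping back to the old space."""
    p = projector(z_new)
    return -F.cosine_similarity(p, z_old.detach(), dim=-1).mean()
```

In a training loop, this term would simply be added to the self-supervised objective, e.g. `loss = ssl_loss + lam * projected_functional_regularization(z_new, z_old, projector)`, where `lam` is a hypothetical weighting hyperparameter. Contrast this with naive feature distillation, which would align `z_new` to `z_old` directly and thereby constrain plasticity.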