We propose a continual learning method that incorporates information from specialized datasets incrementally, by integrating it along the vector field of "generalist" models. The tangent plane to the specialist model acts as a generalist guide and avoids the kind of over-fitting that leads to catastrophic forgetting, while exploiting the convexity of the optimization landscape in the tangent plane. The method maintains a small fixed-size memory buffer, as small as 0.4% of the source datasets, which is updated by simple resampling. Our method achieves state-of-the-art performance across various buffer sizes and datasets. Specifically, in the class-incremental setting we outperform existing methods by an average of 26.24% and 28.48% on Seq-CIFAR-10 and Seq-TinyImageNet, respectively. Our method can easily be combined with existing replay-based continual learning methods. When the memory buffer constraints are relaxed to allow storage of other metadata such as logits, we attain state-of-the-art accuracy with an error reduction of 36% towards the paragon performance on Seq-CIFAR-10.
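To make the tangent-plane idea concrete, the sketch below shows one way a network can be linearized around a set of "generalist" weights using a Jacobian-vector product, so that the model becomes affine in its parameters and a convex loss stays convex in the weights. This is only a minimal illustration under our own assumptions (the function names `linearize` and `model_fn` are hypothetical), not the authors' implementation, and it omits the memory buffer and resampling described in the abstract.

```python
import jax
import jax.numpy as jnp

def linearize(model_fn, w0):
    """First-order (tangent-plane) approximation of model_fn around weights w0.

    f_lin(w, x) = f(w0, x) + J_w f(w0, x) @ (w - w0),
    computed with a Jacobian-vector product so the full Jacobian is never formed.
    """
    def f_lin(w, x):
        # Displacement of the current weights from the linearization point.
        delta = jax.tree_util.tree_map(lambda a, b: a - b, w, w0)
        # jax.jvp returns (f(w0, x), J_w f(w0, x) @ delta).
        out, jvp_out = jax.jvp(lambda p: model_fn(p, x), (w0,), (delta,))
        return out + jvp_out
    return f_lin
```

Because `f_lin` is affine in `w`, training it with a convex loss (e.g. squared error or cross-entropy on the linearized outputs) yields a convex objective in the weights, which is the property the abstract refers to as the convexity of the optimization landscape in the tangent plane.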