Online continual learning, especially when task identities and task boundaries are unavailable, is a challenging continual learning setting. One representative class of methods for online continual learning is replay-based methods, which maintain a replay buffer, called memory, that keeps a small subset of past samples to overcome catastrophic forgetting. Most existing replay-based methods for online continual learning focus on single-label problems, in which each sample in the data stream has only one label. However, multi-label problems can also arise in the online continual learning setting, in which each sample may have more than one label. In the online setting with multi-label samples, the class distribution in the data stream is typically highly imbalanced, and it is challenging to control the class distribution in memory, since changing the number of samples belonging to one class may affect the number of samples belonging to other classes. Yet the class distribution in memory is critical for replay-based methods to achieve good performance, especially when the class distribution in the data stream is highly imbalanced. In this paper, we propose a simple but effective method, called optimizing class distribution in memory (OCDM), for multi-label online continual learning. OCDM formulates the memory update mechanism as an optimization problem and updates the memory by solving this problem. Experiments on two widely used multi-label datasets show that OCDM can control the class distribution in memory well and can outperform other state-of-the-art methods.
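The abstract does not give the details of OCDM's optimization objective or solver. The following is a minimal sketch of the general idea of updating a memory by optimizing its class distribution, assuming a greedy update that drops samples until the memory's multi-label class distribution is as close as possible (in L2 distance) to a uniform target; the function names, the uniform target, and the greedy removal strategy are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def class_distribution(labels):
    """Normalized per-class label counts for a (n, C) binary
    multi-label matrix; a sample can contribute to several classes."""
    counts = labels.sum(axis=0).astype(float)
    total = counts.sum()
    return counts / total if total > 0 else counts

def greedy_memory_update(memory_labels, batch_labels, memory_size):
    """Merge the incoming batch into memory, then greedily drop one
    sample at a time, each time removing the sample whose deletion
    brings the remaining class distribution closest (L2) to uniform.
    Returns the indices (into the merged array) of the kept samples."""
    labels = np.vstack([memory_labels, batch_labels])
    keep = list(range(len(labels)))
    n_classes = labels.shape[1]
    target = np.full(n_classes, 1.0 / n_classes)  # assumed target
    while len(keep) > memory_size:
        best_i, best_d = None, np.inf
        for i in keep:
            trial = [j for j in keep if j != i]
            d = np.linalg.norm(class_distribution(labels[trial]) - target)
            if d < best_d:
                best_i, best_d = i, d
        keep.remove(best_i)
    return keep
```

For example, with a memory dominated by one class, the greedy update discards surplus samples of the overrepresented class rather than the rare one, which is the behavior a random or FIFO buffer cannot guarantee under an imbalanced stream.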