Class-incremental learning (CIL) aims to train a classification model while the number of classes increases phase-by-phase. An inherent challenge of CIL is the stability-plasticity tradeoff, i.e., CIL models should remain stable to retain old knowledge and remain plastic to absorb new knowledge. However, none of the existing CIL models can achieve the optimal tradeoff across different data-receiving settings, where typically the training-from-half (TFH) setting needs more stability, while the training-from-scratch (TFS) setting needs more plasticity. To this end, we design an online learning method that can adaptively optimize the tradeoff without knowing the setting a priori. Specifically, we first introduce the key hyperparameters that influence the tradeoff, e.g., knowledge distillation (KD) loss weights, learning rates, and classifier types. Then, we formulate the hyperparameter optimization process as an online Markov Decision Process (MDP) problem and propose a specific algorithm to solve it. We apply locally estimated rewards and the classic bandit algorithm Exp3 [4] to address the issues that arise when applying online MDP methods to the CIL protocol. Our method consistently improves top-performing CIL methods in both TFH and TFS settings, e.g., boosting the average accuracy of TFH and TFS by 2.2 percentage points on ImageNet-Full, compared to the state-of-the-art [23].
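For concreteness, the sketch below shows how Exp3 can drive such a per-phase hyperparameter choice. It is a minimal, textbook Exp3 implementation, not the paper's code; the candidate KD-loss weights, the exploration rate, and the reward signal are all illustrative assumptions.

```python
import math
import random


class Exp3:
    """Exp3 adversarial bandit: keeps one weight per arm and samples arms
    from a distribution mixing the weights with uniform exploration."""

    def __init__(self, n_arms: int, gamma: float = 0.1):
        self.n_arms = n_arms
        self.gamma = gamma            # exploration rate in (0, 1]
        self.weights = [1.0] * n_arms

    def _probs(self):
        total = sum(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / self.n_arms
                for w in self.weights]

    def select(self) -> int:
        """Sample an arm index according to the current distribution."""
        probs = self._probs()
        r, acc = random.random(), 0.0
        for i, p in enumerate(probs):
            acc += p
            if r <= acc:
                return i
        return self.n_arms - 1

    def update(self, arm: int, reward: float):
        """Importance-weighted update; reward must lie in [0, 1]."""
        p = self._probs()[arm]
        x_hat = reward / p            # unbiased estimate of the arm's reward
        self.weights[arm] *= math.exp(self.gamma * x_hat / self.n_arms)


# Hypothetical usage: each arm is one candidate KD-loss weight; after each
# incremental phase, a locally estimated validation accuracy in [0, 1]
# serves as the reward for the chosen arm.
kd_weights = [0.1, 0.5, 1.0, 2.0]
bandit = Exp3(n_arms=len(kd_weights), gamma=0.2)
for phase in range(5):
    arm = bandit.select()
    reward = random.random()          # stand-in for a measured accuracy
    bandit.update(arm, reward)
```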