Few-Shot Class Incremental Learning (FSCIL) is a challenging continual learning task in which only limited training examples are available during each of several learning sessions. To succeed in this task, it is necessary to avoid over-fitting to new classes caused by the biased distributions of the few-shot training sets. The common approach to this issue is to enhance the representational capability of a pre-defined backbone architecture by adding special modules for backward compatibility with older classes. However, this approach has not yet resolved the dilemma of maintaining high classification accuracy over time while reducing the performance gap between classes learned from large training sets and those learned from few examples. In this work, we propose an alternative approach, Continual Parameter-Efficient CLIP (CPE-CLIP), to reduce the loss of information across learning sessions. Instead of adapting additional modules to address information loss, we leverage the vast knowledge acquired by CLIP during large-scale pre-training and its effectiveness in generalizing to new concepts. Our approach is multimodal and parameter-efficient, relying on learnable prompts for both the language and vision encoders to enable transfer learning across sessions. We also introduce prompt regularization to improve performance and prevent forgetting. Our experimental results demonstrate that CPE-CLIP significantly improves FSCIL performance compared to state-of-the-art proposals while drastically reducing the number of learnable parameters and the training cost.
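To make the prompt-based design concrete, the sketch below shows one plausible PyTorch reading of session-wise learnable prompts for a frozen CLIP-style dual encoder, together with a simple L2 regularizer that keeps the current session's prompts close to the previous session's. This is a minimal illustration, not the authors' implementation: the names (`SessionPrompts`, `prompt_len`, `embed_dim`) and the choice of an L2 penalty are assumptions for demonstration.

```python
import torch
import torch.nn as nn


class SessionPrompts(nn.Module):
    """Learnable prompt tokens for one modality (text or vision).

    The frozen backbone is untouched; these prompts are the only
    trainable parameters, which is what makes the approach
    parameter-efficient."""

    def __init__(self, prompt_len: int = 4, embed_dim: int = 512):
        super().__init__()
        self.prompts = nn.Parameter(0.02 * torch.randn(prompt_len, embed_dim))

    def forward(self, token_embeds: torch.Tensor) -> torch.Tensor:
        # token_embeds: (batch, seq_len, embed_dim). Prepend the shared
        # prompts to every sequence before it enters the frozen encoder.
        batch = token_embeds.size(0)
        p = self.prompts.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([p, token_embeds], dim=1)


def prompt_regularization(current: SessionPrompts,
                          previous: SessionPrompts) -> torch.Tensor:
    """L2 penalty anchoring session-t prompts to the frozen session-(t-1)
    prompts: a simple stand-in for the paper's regularizer against
    forgetting."""
    return sum((p - q.detach()).pow(2).sum()
               for p, q in zip(current.parameters(), previous.parameters()))


# Illustrative step for one incremental session (shapes and the warm-start
# scheme are hypothetical placeholders).
if __name__ == "__main__":
    prev = SessionPrompts()                 # prompts from session t-1
    cur = SessionPrompts()                  # prompts tuned in session t
    cur.load_state_dict(prev.state_dict())  # warm-start from t-1
    embeds = torch.randn(8, 16, 512)        # fake token embeddings
    out = cur(embeds)                       # (8, 4 + 16, 512)
    reg = prompt_regularization(cur, prev)
    print(out.shape, float(reg))            # reg == 0 right after the copy
```

Under these assumptions, each session updates only a few thousand prompt parameters while the backbone stays frozen, which is consistent with the abstract's claim of drastically reduced learnable parameters and training cost.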