Continual Learning (CL) aims to sequentially train models on streams of incoming data whose distribution varies over time, preserving previous knowledge while adapting to new data. The current CL literature focuses on restricted access to previously seen data while imposing no constraints on the computational budget for training. This is unreasonable for applications in the wild, where systems are primarily constrained by computational and time budgets, not storage. We revisit this problem with a large-scale benchmark and analyze the performance of traditional CL approaches in a compute-constrained setting, where the effective number of memory samples used in training can be implicitly restricted as a consequence of limited computation. We conduct experiments evaluating various CL sampling strategies, distillation losses, and partial fine-tuning on two large-scale datasets, namely ImageNet2K and Continual Google Landmarks V2, in data-incremental, class-incremental, and time-incremental settings. Through extensive experiments amounting to over 1500 GPU-hours, we find that, under a compute-constrained setting, traditional CL approaches, with no exception, fail to outperform a simple minimal baseline that samples uniformly from memory. Our conclusions hold across different numbers of stream time steps, e.g., 20 to 200, and under several computational budgets. This suggests that most existing CL methods are too computationally expensive for realistic budgeted deployment. Code for this project is available at: https://github.com/drimpossible/BudgetCL.
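The minimal baseline referred to above can be sketched as plain experience replay with uniform memory sampling under a fixed per-step compute budget. This is an illustrative sketch only, assuming a list-based memory buffer; the function name, batching scheme, and half-replay split are assumptions, not the paper's actual implementation.

```python
import random

def uniform_replay_step(memory, incoming, batch_size=4, seed=0):
    """One training step of a minimal replay baseline: mix new stream
    data with samples drawn uniformly at random from memory.
    (Illustrative sketch; not the paper's implementation.)"""
    rng = random.Random(seed)
    # Draw half the batch uniformly at random from the memory buffer.
    replay = rng.sample(memory, min(batch_size // 2, len(memory)))
    # Fill the remainder of the batch from the incoming stream chunk.
    fresh = incoming[: batch_size - len(replay)]
    return replay + fresh

# Under a budget, the number of such steps per stream time step is capped,
# which implicitly limits how many memory samples the model ever revisits.
```

The key point is that uniform sampling adds essentially no overhead per step, so under a fixed iteration budget all compute goes into gradient updates rather than into sample selection or distillation machinery.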