In the online continual learning paradigm, agents must learn from a changing distribution while respecting memory and compute constraints. Experience Replay (ER), where a small subset of past data is stored and replayed alongside new data, has emerged as a simple and effective learning strategy. In this work, we focus on the change in representations of observed data that arises when previously unobserved classes appear in the incoming data stream and must be distinguished from previous ones. We shed new light on this question by showing that applying ER causes the newly added classes' representations to overlap significantly with those of the previous classes, leading to highly disruptive parameter updates. Based on this empirical analysis, we propose a new method that mitigates this issue by shielding the learned representations from drastic adaptation to accommodate new classes. We show that using an asymmetric update rule pushes new classes to adapt to the older ones (rather than the reverse), which is especially effective at task boundaries, where much of the forgetting typically occurs. Empirical results show significant gains over strong baselines on standard continual learning benchmarks.
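To make the asymmetric update rule concrete, below is a minimal PyTorch sketch of one way such a rule can be realized, assuming a classifier that outputs logits over all classes seen so far. The function name and the exact masking scheme are illustrative assumptions, not the paper's released code: incoming samples compete only among the classes present in the incoming batch, so their gradients cannot drag old-class representations around, while replayed samples use the ordinary full softmax.

    import torch
    import torch.nn.functional as F

    def asymmetric_replay_loss(logits_new, y_new, logits_replay, y_replay):
        """Asymmetric cross-entropy in the spirit of the abstract.

        logits_new / y_new: logits and labels for the incoming batch.
        logits_replay / y_replay: logits and labels for the replay batch.
        """
        # Mask out (with -inf) the logits of classes absent from the
        # incoming batch, so the softmax over new data ignores, and thus
        # shields, the representations of older classes.
        present = torch.unique(y_new)
        mask = torch.full_like(logits_new, float("-inf"))
        mask[:, present] = 0.0
        loss_new = F.cross_entropy(logits_new + mask, y_new)

        # Replayed (old) samples are trained with the usual symmetric loss
        # over all classes, anchoring the previously learned structure.
        loss_replay = F.cross_entropy(logits_replay, y_replay)
        return loss_new + loss_replay

The asymmetry lies entirely in the masking: new classes must carve out their own regions relative to the fixed old ones, rather than forcing old-class representations to move, which is where the disruptive updates at task boundaries would otherwise originate.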