Optimization and Generalization of Regularization-Based Continual Learning: a Loss Approximation Viewpoint

Neural networks have achieved remarkable success in many cognitive tasks. However, when they are trained sequentially on multiple tasks without access to old data, their performance on early tasks tends to drop significantly. This problem is often referred to as catastrophic forgetting, a key challenge in continual learning of neural networks. The regularization-based approach is one of the primary classes of methods for alleviating catastrophic forgetting. In this paper, we provide a novel viewpoint on regularization-based continual learning by formulating it as a second-order Taylor approximation of the loss function of each task. This viewpoint leads to a unified framework that can be instantiated to derive many existing algorithms, such as Elastic Weight Consolidation and Kronecker-factored Laplace approximation. Based on this viewpoint, we study the optimization aspects (i.e., convergence) as well as the generalization properties (i.e., finite-sample guarantees) of regularization-based continual learning. Our theoretical results indicate the importance of accurate approximation of the Hessian matrix. The experimental results on several benchmarks provide empirical validation of our theoretical findings.
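The loss-approximation viewpoint can be illustrated on a toy problem. The sketch below is not the paper's algorithm or experimental setup. It assumes two synthetic least-squares tasks, for which the second-order Taylor expansion of the task-1 loss around its minimizer is exact. Training on task 2 with the quadratic penalty 0.5 (w - w1)^T H (w - w1) then recovers the joint minimizer when H is the full task-1 Hessian, while a diagonal-only H (loosely in the spirit of Elastic Weight Consolidation's diagonal curvature estimate) generally does not. The mixing matrix `A`, the weight vectors, and every function name are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 3, 200

# Hypothetical mixing matrix: correlates the features so that the
# Hessian of each task has non-zero off-diagonal entries.
A = np.array([[1.0, 0.9, 0.0],
              [0.0, 1.0, 0.9],
              [0.0, 0.0, 1.0]])


def make_task(w_true):
    """Synthetic linear-regression task with correlated features."""
    X = rng.normal(size=(n, d)) @ A
    y = X @ w_true + 0.1 * rng.normal(size=n)
    return X, y


def quad_parts(X, y):
    """Hessian H and linear term g of the loss 0.5 * mean((X w - y)^2)."""
    return X.T @ X / len(y), X.T @ y / len(y)


def loss(w, X, y):
    return 0.5 * np.mean((X @ w - y) ** 2)


X1, y1 = make_task(np.array([1.0, -2.0, 0.5]))
X2, y2 = make_task(np.array([-1.0, 0.5, 2.0]))
H1, g1 = quad_parts(X1, y1)
H2, g2 = quad_parts(X2, y2)

# Minimizer of task 1 alone, and of both tasks trained jointly.
w1 = np.linalg.solve(H1, g1)
w_joint = np.linalg.solve(H1 + H2, g1 + g2)


def continual_step(H_approx):
    """Minimize loss_2(w) + 0.5 (w - w1)^T H_approx (w - w1) in closed form.

    Setting the gradient to zero gives (H2 + H_approx) w = g2 + H_approx w1.
    """
    return np.linalg.solve(H2 + H_approx, g2 + H_approx @ w1)


w_full = continual_step(H1)                    # exact Hessian of task 1
w_diag = continual_step(np.diag(np.diag(H1)))  # diagonal-only approximation


def total_loss(w):
    return loss(w, X1, y1) + loss(w, X2, y2)


print("joint training      :", total_loss(w_joint))
print("full-Hessian penalty:", total_loss(w_full))
print("diagonal penalty    :", total_loss(w_diag))
```

In this quadratic setting the full-Hessian penalty matches joint training up to numerical precision, and the diagonal penalty incurs a higher combined loss. This is a small-scale analogue of the abstract's point that the accuracy of the Hessian approximation matters. For neural networks the Taylor expansion is only approximate, so the exact equivalence shown here should not be expected to carry over.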

