Wide Neural Networks Forget Less Catastrophically

A growing body of research in continual learning is devoted to overcoming the "catastrophic forgetting" of neural networks by designing new algorithms that are more robust to distribution shifts. While recent progress in the continual learning literature is encouraging, our understanding of which properties of neural networks contribute to catastrophic forgetting is still limited. To address this, instead of focusing on continual learning algorithms, in this work we focus on the model itself and study the impact of the "width" of the neural network architecture on catastrophic forgetting, and show that width has a surprisingly significant effect on forgetting. To explain this effect, we study the learning dynamics of the network from various perspectives, such as gradient norm and sparsity, orthogonalization, and the lazy training regime. We provide potential explanations that are consistent with the empirical results across different architectures and continual learning benchmarks.

