Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting

From arXiv. Accepted by Neural Computation (MIT Press); 20 pages; 13 figures. Keywords: Neural Variability, Neuroscience, Deep Learning, Label Noise, Catastrophic Forgetting

Deep learning is often criticized for two serious issues that rarely exist in natural nervous systems: overfitting and catastrophic forgetting. A deep network can even memorize randomly labelled data, which carries little knowledge behind the instance-label pairs. When a deep network continually learns over time by accommodating new tasks, it usually quickly overwrites the knowledge learned from previous tasks. It is well known in neuroscience that human brain reactions exhibit substantial variability even in response to the same stimulus, a phenomenon referred to as neural variability. This mechanism balances accuracy and plasticity/flexibility in the motor learning of natural nervous systems. This motivates us to design a similar mechanism, named artificial neural variability (ANV), which helps artificial neural networks learn some advantages from "natural" neural networks. We rigorously prove that ANV acts as an implicit regularizer of the mutual information between the training data and the learned model. This result theoretically guarantees that ANV yields strictly improved generalizability, robustness to label noise, and robustness to catastrophic forgetting. We then devise a neural variable risk minimization (NVRM) framework and neural variable optimizers to achieve ANV for conventional network architectures in practice. The empirical studies demonstrate that NVRM can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible costs. Code: https://github.com/zeke-xie/artificial-neural-variability-for-deep-learning
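The abstract does not spell out how a neural variable optimizer works, so the following is only a minimal sketch of the general idea of training under weight variability, not the authors' NVRM algorithm. It assumes the variability is modelled as fresh Gaussian noise added to the weights before each gradient evaluation, on a toy linear regression problem. The function name `noisy_weight_sgd` and the parameters `noise_std`, `lr`, and `steps` are hypothetical.

```python
import numpy as np


def noisy_weight_sgd(X, y, noise_std=0.01, lr=0.1, steps=500, seed=0):
    """Gradient descent on mean squared error, with the gradient evaluated
    at randomly perturbed weights (an assumed stand-in for weight variability)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        # Draw fresh Gaussian noise so the model is evaluated at perturbed weights.
        eps = rng.normal(0.0, noise_std, size=w.shape)
        w_noisy = w + eps
        # Gradient of the MSE loss, taken at the perturbed weights.
        grad = 2.0 * X.T @ (X @ w_noisy - y) / len(y)
        # The update is applied to the clean (unperturbed) weights.
        w = w - lr * grad
    return w


# Toy data: noiseless linear targets with known weights.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w

w_hat = noisy_weight_sgd(X, y)
print("recovered weights:", w_hat)
```

With a small `noise_std` the recovered weights stay close to `true_w`; larger values trade accuracy for more variability, which is the balance the abstract alludes to.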



Overfitting


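In AI, overfitting usually means that a learned model is too complex: it performs well on the training set but poorly on the test set, that is, it generalizes badly. Overfitting is also called overlearning. It arises during parameter fitting because the training data contain sampling error, and a complex model fits that sampling error along with the underlying signal.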
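The definition above can be illustrated with a small, self-contained sketch. It assumes a toy setting that is not from the source: ten noisy samples of a sine curve, fitted with a low-degree and a high-degree polynomial. The helper `fit_and_score` and all data choices are hypothetical.

```python
import numpy as np
from numpy.polynomial import Polynomial


def fit_and_score(deg, x_train, y_train, x_test, y_test):
    """Least-squares polynomial fit; returns (train MSE, test MSE)."""
    model = Polynomial.fit(x_train, y_train, deg)
    train_mse = float(np.mean((model(x_train) - y_train) ** 2))
    test_mse = float(np.mean((model(x_test) - y_test) ** 2))
    return train_mse, test_mse


def target(x):
    # Assumed ground-truth function for the toy problem.
    return np.sin(np.pi * x)


rng = np.random.default_rng(0)
# Small training set with sampling noise.
x_train = np.linspace(-1.0, 1.0, 10)
y_train = target(x_train) + rng.normal(0.0, 0.3, size=x_train.shape)
# Dense, noise-free test grid.
x_test = np.linspace(-1.0, 1.0, 200)
y_test = target(x_test)

simple_fit = fit_and_score(3, x_train, y_train, x_test, y_test)
complex_fit = fit_and_score(9, x_train, y_train, x_test, y_test)
print("degree 3 (train MSE, test MSE):", simple_fit)
print("degree 9 (train MSE, test MSE):", complex_fit)
```

The degree-9 polynomial has as many coefficients as there are training points, so it passes through every noisy sample and its training error is essentially zero. Comparing the two test errors shows how much of that fit is sampling noise rather than signal.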
