Deep learning is often criticized for two serious issues that rarely exist in natural nervous systems: overfitting and catastrophic forgetting. A deep network can even memorize randomly labelled data, in which the instance-label pairs carry little underlying knowledge. When a deep network continually learns over time by accommodating new tasks, it usually quickly overwrites the knowledge learned from previous tasks. In neuroscience, it is well known that human brain reactions exhibit substantial variability even in response to the same stimulus, a phenomenon referred to as {\it neural variability}. This mechanism balances accuracy and plasticity/flexibility in the motor learning of natural nervous systems. It motivates us to design a similar mechanism, named {\it artificial neural variability} (ANV), which helps artificial neural networks inherit some advantages of ``natural'' neural networks. We rigorously prove that ANV acts as an implicit regularizer of the mutual information between the training data and the learned model. This result theoretically guarantees that ANV strictly improves generalizability, robustness to label noise, and robustness to catastrophic forgetting. We then devise a {\it neural variable risk minimization} (NVRM) framework and {\it neural variable optimizers} to achieve ANV for conventional network architectures in practice. Empirical studies demonstrate that NVRM effectively relieves overfitting, label-noise memorization, and catastrophic forgetting at negligible cost.\footnote{Code: \url{https://github.com/zeke-xie/artificial-neural-variability-for-deep-learning}.}
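As a rough illustration of the idea behind NVRM (a minimal sketch, not the paper's implementation: it assumes ANV is realized as Gaussian perturbations of the weights, and uses a toy linear-regression model with plain gradient descent), each training step below evaluates the gradient at noise-perturbed weights, so the optimizer minimizes an expected risk under weight variability rather than the plain empirical risk:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = 2x plus small observation noise.
X = rng.normal(size=(64, 1))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=64)

w = np.zeros(1)   # model weight
sigma = 0.05      # scale of the artificial neural variability (assumed hyperparameter)
lr = 0.1

for _ in range(200):
    # NVRM-style step (sketch): perturb the weights with Gaussian noise
    # before computing the gradient, so the update targets the expected
    # loss over the weight distribution N(w, sigma^2 I).
    eps = sigma * rng.normal(size=w.shape)
    w_noisy = w + eps
    pred = X @ w_noisy
    grad = 2.0 * X.T @ (pred - y) / len(y)
    w -= lr * grad

print(float(w[0]))  # converges close to the true slope 2.0
```

The variability scale `sigma` plays the regularization role: larger values flatten the effective loss landscape at the cost of fit, which is the accuracy/plasticity trade-off the abstract describes.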