Within-layer Diversity Reduces Generalization Gap

Neural networks are composed of multiple layers arranged in a hierarchical structure and trained jointly with gradient-based optimization, where the errors are back-propagated from the last layer to the first. At each optimization step, neurons at a given layer receive feedback from neurons belonging to higher layers of the hierarchy. In this paper, we propose to complement this traditional 'between-layer' feedback with additional 'within-layer' feedback to encourage diversity of the activations within the same layer. To this end, we measure the pairwise similarity between the outputs of the neurons and use it to model the layer's overall diversity. By penalizing similarities and promoting diversity, we encourage each neuron to learn a distinctive representation, thereby enriching the data representation learned within the layer and increasing the total capacity of the model. We theoretically study how within-layer activation diversity affects the generalization performance of a neural network and prove that increasing the diversity of hidden activations reduces the estimation error. In addition to the theoretical guarantees, we present an empirical study on three datasets confirming that the proposed approach enhances the performance of state-of-the-art neural network models and decreases the generalization gap.
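
As an illustration of the within-layer feedback described above, the following is a minimal sketch of a pairwise-similarity penalty computed on one layer's activations. The abstract does not state which similarity measure is used, so the RBF kernel, the function name `within_layer_similarity_penalty`, and the bandwidth `gamma` are assumptions made here for illustration only.

```python
import numpy as np


def within_layer_similarity_penalty(activations, gamma=1.0):
    """Mean pairwise similarity between the neurons of a single layer.

    activations: array of shape (batch, units), one column per neuron.
    gamma: RBF bandwidth (hypothetical hyperparameter, not from the paper).
    Returns a value in (0, 1]; lower means more diverse activations.
    """
    acts = np.asarray(activations, dtype=float)
    n_units = acts.shape[1]
    if n_units < 2:
        return 0.0  # a single neuron has no pairs to compare

    # diff[b, i, j] = output of neuron i minus output of neuron j on sample b
    diff = acts[:, :, None] - acts[:, None, :]

    # Assumed similarity: RBF kernel per sample, averaged over the batch
    sim = np.exp(-gamma * diff ** 2).mean(axis=0)  # shape (units, units)

    # Drop the diagonal (each neuron is trivially similar to itself)
    off_diag_sum = sim.sum() - np.trace(sim)
    return off_diag_sum / (n_units * (n_units - 1))
```

In training, such a term would typically be added to the task loss with a weight, for example `total_loss = task_loss + lam * penalty`, where `lam` is a hypothetical trade-off coefficient; in practice the penalty would be written in an autodiff framework so that it contributes gradients.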
