In deep neural networks, the spectral norm of the Jacobian of a layer bounds the factor by which the norm of a signal changes during forward and backward propagation. Regularizing the spectral norm has been shown to improve the generalization, robustness, and optimization of deep learning methods. Existing methods for computing the spectral norm of convolution layers either rely on heuristics that are computationally efficient but lack guarantees, or are theoretically sound but computationally expensive. In this work, we obtain the best of both worlds by deriving {\it four} provable upper bounds on the spectral norm of a standard 2D multi-channel convolution layer. These bounds are differentiable and can be computed efficiently during training with negligible overhead. One of these bounds is in fact the popular heuristic method of Miyato et al. (multiplied by a constant factor that depends on the filter size). Depending on the convolution filter, any one of the four bounds may be the tightest; we therefore propose to use their minimum as a tight, differentiable, and efficient upper bound on the spectral norm of convolution layers. We show that this spectral bound is an effective regularizer and can be used to bound either the Lipschitz constant or the curvature values (eigenvalues of the Hessian) of neural networks. Through experiments on MNIST and CIFAR-10, we demonstrate its effectiveness in improving the generalization and provable robustness of deep networks.
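As an illustration, the bound corresponding to Miyato et al.'s heuristic can be computed with power iteration on the reshaped filter. The sketch below is a minimal PyTorch rendering under our assumptions, not the authors' implementation: in particular, we assume the constant factor for an $h \times w$ filter is $\sqrt{hw}$, and the helper name \texttt{conv\_spectral\_bound} is ours.

\begin{verbatim}
import math
import torch
import torch.nn.functional as F

def conv_spectral_bound(weight, u=None, n_iters=1):
    # weight: convolution filter of shape (c_out, c_in, h, w).
    # Returns a differentiable upper bound on the spectral norm
    # of the convolution Jacobian, plus the power-iteration
    # vector u, to be reused at the next training step.
    c_out, c_in, h, w = weight.shape
    mat = weight.reshape(c_out, -1)        # c_out x (c_in * h * w)
    if u is None:
        u = torch.randn(c_out, device=weight.device)
    with torch.no_grad():                  # power iteration on mat
        for _ in range(n_iters):
            v = F.normalize(mat.t() @ u, dim=0)
            u = F.normalize(mat @ v, dim=0)
    sigma = torch.dot(u, mat @ v)          # approx. top singular value
    return math.sqrt(h * w) * sigma, u     # sqrt(h*w) factor: our assumption
\end{verbatim}

In training, \texttt{u} would typically be kept as a persistent buffer across steps so that a single iteration per step suffices, and the minimum over the four bounds would be added to the loss as a regularization term.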