It is a highly desirable property for deep networks to be robust against small input changes. One popular way to achieve this property is by designing networks with a small Lipschitz constant. In this work, we propose a new technique for constructing such Lipschitz networks that has a number of desirable properties: it can be applied to any linear network layer (fully-connected or convolutional), it provides formal guarantees on the Lipschitz constant, it is easy to implement and efficient to run, and it can be combined with any training objective and optimization method. In fact, our technique is the first one in the literature that achieves all of these properties simultaneously. Our main contribution is a rescaling-based weight matrix parametrization that guarantees each network layer to have a Lipschitz constant of at most 1 and results in learned weight matrices that are close to orthogonal. Hence we call such layers almost-orthogonal Lipschitz (AOL). Experiments and ablation studies in the context of image classification with certified robust accuracy confirm that AOL layers achieve results that are on par with most existing methods. Yet, they are simpler to implement and more broadly applicable, because they do not require computationally expensive matrix orthogonalization or inversion steps as part of the network architecture. We provide code at https://github.com/berndprach/AOL.
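To illustrate the rescaling-based parametrization in the fully-connected case, the following is a minimal PyTorch sketch. It rescales the columns of a weight matrix W by d_j = (sum_i |W^T W|_{ji})^{-1/2}, which bounds the spectral norm of the rescaled matrix by 1. The function name `aol_rescale` and the epsilon constant are illustrative choices, not the API of the linked repository.

```python
import torch

def aol_rescale(weight: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Rescale a dense weight matrix (shape m x n) so the linear map is 1-Lipschitz.

    With V = |W^T W| and d_j = (sum_i V_{ji})^{-1/2}, the matrix W @ diag(d)
    has spectral norm at most 1; eps guards against division by zero.
    """
    v = torch.abs(weight.T @ weight)      # |W^T W|, shape (n, n)
    d = (v.sum(dim=1) + eps).rsqrt()      # per-column rescaling factors, shape (n,)
    return weight * d                     # broadcasting scales column j of W by d_j
```

Because the rescaling is differentiable and cheap (one n x n matrix product per forward pass), it can be applied to the raw parameters at every training step and combined with any loss and optimizer, in contrast to approaches that require explicit orthogonalization or matrix inversion.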