Adversarial examples can easily degrade the classification performance of neural networks. Empirical methods for promoting robustness to such examples have been proposed, but they often lack both analytical insights and formal guarantees. Recently, some robustness certificates based on system-theoretic notions have appeared in the literature. This work proposes an incremental dissipativity-based robustness certificate for neural networks, expressed as a linear matrix inequality for each layer. We also propose an equivalent spectral norm bound for this certificate that is scalable to neural networks with multiple layers. We demonstrate improved performance against adversarial attacks on a feed-forward neural network trained on MNIST and on an AlexNet trained on CIFAR-10.
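The following is a minimal, illustrative sketch of the kind of per-layer spectral norm check the abstract alludes to, not the paper's exact LMI-based certificate. It assumes fully connected layers with 1-Lipschitz activations (e.g., ReLU), in which case the product of per-layer spectral norms upper-bounds the network's Lipschitz constant; all names and the toy dimensions are hypothetical.

```python
import numpy as np

def layer_spectral_norms(weights):
    """Largest singular value (spectral norm) of each layer's weight matrix."""
    return [np.linalg.norm(W, ord=2) for W in weights]

def lipschitz_upper_bound(weights):
    """Product of per-layer spectral norms: a (generally loose) global Lipschitz bound,
    valid when activations are 1-Lipschitz."""
    return float(np.prod(layer_spectral_norms(weights)))

# Toy usage: three random layers of a small feed-forward network.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((64, 784)),
           rng.standard_normal((32, 64)),
           rng.standard_normal((10, 32))]
print(layer_spectral_norms(weights))
print(lipschitz_upper_bound(weights))
```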