Adversarial training provides a principled approach for training robust neural networks. From an optimization perspective, adversarial training essentially solves a bilevel optimization problem: the leader problem learns a robust classifier, while the follower problem generates adversarial samples. Unfortunately, such a bilevel problem is difficult to solve due to its highly complicated structure. This work proposes a new adversarial training method based on a generic learning-to-learn (L2L) framework. Specifically, instead of applying existing hand-designed algorithms to the inner problem, we learn an optimizer, which is parametrized as a convolutional neural network. At the same time, a robust classifier is learned to defend against the adversarial attacks generated by the learned optimizer. Experiments on the CIFAR-10 and CIFAR-100 datasets demonstrate that L2L outperforms existing adversarial training methods in both classification accuracy and computational efficiency. Moreover, our L2L framework can be extended to generative adversarial imitation learning and stabilizes its training.
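The bilevel structure described above can be sketched, assuming the standard norm-bounded threat model (the specific perturbation set is an assumption here, not stated in the abstract), as a min-max problem:

\[
\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}} \left[ \max_{\|\delta\| \le \epsilon} \ell\big(f_\theta(x+\delta),\, y\big) \right]
\]

where \(f_\theta\) is the classifier, \(\ell\) the classification loss, and \(\epsilon\) the perturbation budget. The outer minimization over \(\theta\) is the leader problem (the robust classifier), and the inner maximization over \(\delta\) is the follower problem (the attack); in the L2L framework, the inner maximizer is produced by a learned optimizer rather than a hand-designed procedure such as projected gradient ascent.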