While existing work in robust deep learning has focused on small pixel-level norm-based perturbations, these may not account for perturbations encountered in many real-world settings. In many such cases, although test data might not be available, broad specifications about the types of perturbations (such as an unknown degree of rotation) may be known. We consider a setup where robustness is expected over an unseen test domain that is not i.i.d. with the training domain but deviates from it. While this deviation may not be exactly known, its broad characterization is specified a priori in terms of attributes. We propose an adversarial training approach that learns to generate new samples so as to maximize the classifier's exposure to the attribute space, without having access to data from the test domain. Our adversarial training solves a min-max optimization problem, with the inner maximization generating adversarial perturbations and the outer minimization finding model parameters that minimize the loss on the perturbations generated by the inner maximization. We demonstrate the applicability of our approach on three types of naturally occurring perturbations -- object-related shifts, geometric transformations, and common image corruptions -- and show that it enables deep neural networks to be robust against a wide range of such perturbations. We demonstrate the usefulness of the proposed approach by showing the robustness gains of deep neural networks trained with it on MNIST, CIFAR-10, and a new variant of the CLEVR dataset.
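As a sketch, the min-max training objective described above can be written in the standard adversarial-training form (the symbols $f_{\theta}$, $g$, $\delta$, $\Delta$, and $\ell$ are illustrative notation, not taken from the paper):

\[
\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}} \Big[ \max_{\delta \in \Delta} \; \ell\big(f_{\theta}(g(x,\delta)),\, y\big) \Big]
\]

Here $g(x,\delta)$ applies an attribute-space perturbation $\delta$ (e.g., a rotation by an unknown angle) to the input $x$, $\Delta$ is the attribute range specified a priori, $\ell$ is the classification loss, and the outer minimization fits the model parameters $\theta$ on the adversarially generated samples.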