While existing work in robust deep learning has focused on small pixel-level $\ell_p$ norm-based perturbations, these may not account for the perturbations encountered in many real-world settings. In many such cases, although test data might not be available, broad specifications of the types of perturbations (such as an unknown degree of rotation) may be known. We consider a setup where robustness is expected over an unseen test domain that is not i.i.d. but deviates from the training domain. While this deviation may not be exactly known, its broad characterization is specified a priori in terms of attributes. We propose an adversarial training approach that learns to generate new samples so as to maximize the classifier's exposure to the attribute space, without access to data from the test domain. Our adversarial training solves a min-max optimization problem: the inner maximization generates adversarial perturbations, and the outer minimization finds model parameters that minimize the loss on the perturbations generated by the inner maximization. We demonstrate the applicability of our approach on three types of naturally occurring perturbations -- object-related shifts, geometric transformations, and common image corruptions -- and show that it enables deep neural networks to be robust against a wide range of such perturbations. We demonstrate the resulting robustness gains on MNIST, CIFAR-10, and a new variant of the CLEVR dataset.