Classifier Robustness Enhancement Via Test-Time Transformation

It was recently discovered that adversarially trained classifiers exhibit an intriguing property, referred to as perceptually aligned gradients (PAG). PAG implies that the gradients of such classifiers possess a meaningful structure that is aligned with human perception. Adversarial training is currently the best-known way to achieve classification robustness under adversarial attacks. The PAG property, however, has yet to be leveraged for further improving classifier robustness. In this work, we introduce Classifier Robustness Enhancement Via Test-Time Transformation (TETRA), a novel defense method that utilizes PAG to enhance the performance of trained robust classifiers. Our method operates in two phases. First, it modifies the input image via a designated targeted adversarial attack toward each of the dataset's classes, producing one modified instance per class. Then, it classifies the input image based on its distance to each of the modified instances, under the assumption that the shortest distance corresponds to the true class. We show that the proposed method achieves state-of-the-art results and validate this claim through extensive experiments on a variety of defense methods, classifier architectures, and datasets. We also empirically demonstrate that TETRA can boost the accuracy of any differentiable adversarially trained classifier across a variety of attacks, including ones unseen at training. Specifically, applying TETRA leads to substantial improvements of up to $+23\%$, $+20\%$, and $+26\%$ on CIFAR10, CIFAR100, and ImageNet, respectively.
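
The two phases described above can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the abstract does not specify the attack, its hyperparameters, or the distance metric, so everything concrete here is an assumption made for illustration. The classifier is a toy linear softmax model (hypothetical `WEIGHTS` and `BIASES`), the "targeted attack" is plain gradient descent on the cross-entropy loss toward each class, the distance is Euclidean, and all function names are invented.

```python
import math

# Toy 3-class linear softmax classifier over 2-D inputs (hypothetical values).
WEIGHTS = [[2.0, 0.0], [-1.0, 2.0], [-1.0, -2.0]]
BIASES = [0.0, 0.0, 0.0]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_of(x, weights, biases):
    """Logits of the linear model: W x + b."""
    return [sum(w_d * x_d for w_d, x_d in zip(w, x)) + b
            for w, b in zip(weights, biases)]


def targeted_loss_gradient(x, target, weights, biases):
    """Gradient w.r.t. x of the cross-entropy loss for class `target`.

    For a linear softmax model this is W^T (p - onehot(target)).
    """
    probs = softmax(logits_of(x, weights, biases))
    coeffs = [p - (1.0 if k == target else 0.0) for k, p in enumerate(probs)]
    return [sum(c * w[d] for c, w in zip(coeffs, weights))
            for d in range(len(x))]


def transform_toward(x, target, weights, biases, steps=20, step_size=0.1):
    """Phase 1 (assumed form): move x toward `target` by gradient descent."""
    z = list(x)
    for _ in range(steps):
        g = targeted_loss_gradient(z, target, weights, biases)
        z = [z_d - step_size * g_d for z_d, g_d in zip(z, g)]
    return z


def l2_distance(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def distance_based_predict(x, weights, biases, steps=20, step_size=0.1):
    """Phase 2: predict the class whose transformed instance is closest to x."""
    distances = [
        l2_distance(x, transform_toward(x, k, weights, biases, steps, step_size))
        for k in range(len(weights))
    ]
    predicted = min(range(len(distances)), key=distances.__getitem__)
    return predicted, distances


if __name__ == "__main__":
    label, dists = distance_based_predict([1.5, 0.2], WEIGHTS, BIASES)
    print("predicted class:", label)
    print("per-class distances:", [round(d, 4) for d in dists])
```

In this toy setting, an input that already sits well inside one class's region barely moves when pushed toward that class and moves much further when pushed toward any other, which is the intuition behind choosing the shortest distance. Whether this carries over to a deep network depends on the gradient structure the abstract attributes to adversarially trained classifiers.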


