An established way to improve the transferability of black-box evasion attacks is to craft the adversarial examples on an ensemble-based surrogate to increase diversity. We argue that transferability is fundamentally related to uncertainty. Based on a state-of-the-art Bayesian Deep Learning technique, we propose a new method to efficiently build a surrogate by sampling approximately from the posterior distribution of neural network weights, which represents the belief about the value of each parameter. Our extensive experiments on ImageNet, CIFAR-10 and MNIST show that our approach improves the success rates of four state-of-the-art attacks significantly (by up to 83.2 percentage points), for both intra-architecture and inter-architecture transferability. On ImageNet, our approach reaches a 94% success rate while reducing training computations from 11.6 to 2.4 exaflops, compared to an ensemble of independently trained DNNs. Our vanilla surrogate achieves higher transferability than three test-time techniques designed for this purpose in 87.5% of cases. Our work demonstrates that the way a surrogate is trained has been overlooked, even though it is an important element of transfer-based attacks. We are, therefore, the first to review the effectiveness of several training methods in increasing transferability. We provide new directions to better understand the transferability phenomenon and offer a simple but strong baseline for future work.
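The sketch below illustrates the general idea of a posterior-sampled surrogate, not the paper's exact recipe: weight snapshots are collected with SGLD-style noisy SGD as approximate posterior samples, and an I-FGSM attack is then run against the averaged predictions of those snapshots. The architecture, data, and hyperparameters (`sgld_samples`, `ifgsm_on_posterior`, learning rate, thinning interval, epsilon) are illustrative placeholders chosen for this example.

```python
# Hedged sketch: approximate posterior sampling of surrogate weights (SGLD-style),
# then a transfer attack crafted on the averaged prediction of the sampled models.
# Model, data, and hyperparameters are placeholders, not the authors' settings.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"

# Placeholder surrogate architecture and synthetic data (replace with a real DNN/dataset).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.ReLU(),
                      nn.Linear(256, 10)).to(device)
x_train = torch.randn(512, 3, 32, 32, device=device)
y_train = torch.randint(0, 10, (512,), device=device)

def sgld_samples(model, lr=1e-4, n_steps=2000, thin=200, temperature=1.0):
    """Collect weight snapshots with SGLD: SGD steps plus Gaussian noise scaled by lr."""
    samples = []
    for step in range(n_steps):
        idx = torch.randint(0, x_train.size(0), (64,), device=device)
        loss = F.cross_entropy(model(x_train[idx]), y_train[idx])
        model.zero_grad()
        loss.backward()
        with torch.no_grad():
            for p in model.parameters():
                noise = torch.randn_like(p) * (2.0 * lr * temperature) ** 0.5
                p.add_(-lr * p.grad + noise)
        if (step + 1) % thin == 0:  # thin the chain to reduce correlation between samples
            samples.append(copy.deepcopy(model).eval())
    return samples

def ifgsm_on_posterior(samples, x, y, eps=8 / 255, alpha=2 / 255, n_iter=10):
    """I-FGSM against the averaged log-probabilities of the sampled surrogate weights."""
    x_adv = x.clone().detach()
    for _ in range(n_iter):
        x_adv.requires_grad_(True)
        logp = torch.stack([F.log_softmax(m(x_adv), dim=1) for m in samples]).mean(0)
        loss = F.nll_loss(logp, y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = torch.max(torch.min(x_adv + alpha * grad.sign(), x + eps), x - eps)
        x_adv = x_adv.clamp(0, 1).detach()
    return x_adv

surrogates = sgld_samples(model)
x_adv = ifgsm_on_posterior(surrogates, x_train[:16], y_train[:16])
```

Compared with an ensemble of independently trained DNNs, the weight samples here come from a single training run, which is what makes the surrogate cheap to build; any iterative transfer attack can be substituted for the I-FGSM loop.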