The transferability of adversarial examples across deep neural networks (DNNs) is the crux of many black-box attacks. Many prior efforts have sought to improve transferability by increasing the diversity of the inputs fed to substitute models. In this paper, by contrast, we pursue diversity in the substitute models themselves and advocate attacking a Bayesian model to achieve desirable transferability. Building on the Bayesian formulation, we develop a principled strategy for possible finetuning, which can be combined with many off-the-shelf Gaussian posterior approximations over DNN parameters. Extensive experiments on common benchmark datasets verify the effectiveness of our method: it outperforms recent state-of-the-art methods by large margins (roughly a 19% absolute increase in average attack success rate on ImageNet), and combining it with these recent methods yields further performance gains. Our code: https://github.com/qizhangli/MoreBayesian-attack.
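As a rough illustration of what attacking a Bayesian substitute model can look like, the sketch below averages input gradients over parameters sampled from a Gaussian centred on the substitute's weights, then takes one sign-gradient step. Everything here is an assumption made for illustration rather than the released implementation: the substitute is a toy logistic-regression model standing in for a DNN, the posterior is an isotropic Gaussian, the single FGSM-style step replaces whatever iterative attack is used in practice, and the names `bayesian_fgsm`, `sigma`, and `n_samples` are hypothetical.

```python
import numpy as np


def bce_loss(w, b, x, y):
    """Binary cross-entropy of a logistic model p = sigmoid(w.x + b)."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))


def loss_grad_wrt_input(w, b, x, y):
    """Gradient of the binary cross-entropy with respect to the input x."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    return (p - y) * w  # dL/dx = (p - y) * w for a logistic model


def bayesian_fgsm(w_mean, b, x, y, eps=0.1, sigma=0.05, n_samples=20, seed=0):
    """One sign-gradient step against a Monte Carlo estimate of a Bayesian model.

    Assumption: the posterior over weights is N(w_mean, sigma^2 * I).
    """
    rng = np.random.default_rng(seed)
    grad = np.zeros_like(x)
    for _ in range(n_samples):
        # Draw one substitute model from the assumed Gaussian posterior.
        w = w_mean + sigma * rng.standard_normal(w_mean.shape)
        grad += loss_grad_wrt_input(w, b, x, y)
    grad /= n_samples  # average input gradient over the sampled models
    # Move each input coordinate by eps in the direction that raises the loss.
    return x + eps * np.sign(grad)


if __name__ == "__main__":
    w_mean = np.array([1.0, -2.0, 0.5])  # hypothetical substitute weights
    x = np.array([0.2, -0.1, 0.4])       # hypothetical clean input
    x_adv = bayesian_fgsm(w_mean, b=0.0, x=x, y=1.0)
    print("clean loss:", bce_loss(w_mean, 0.0, x, 1.0))
    print("adversarial loss:", bce_loss(w_mean, 0.0, x_adv, 1.0))
```

Averaging gradients over sampled parameters is what supplies diversity on the model side, in contrast to the input-side diversity of earlier methods. With a real DNN the gradient would come from automatic differentiation rather than the closed form used here.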