Deep Neural Networks (DNNs) are vulnerable to adversarial examples, which pose serious threats to security-critical applications. This has motivated much research on mechanisms to make models more robust against adversarial attacks. Unfortunately, most of these defenses, such as gradient masking, are easily circumvented by different attack techniques. In this paper, we propose MUTEN, a low-cost method to improve the success rate of well-known attacks against gradient-masking models. Our idea is to apply the attacks to an ensemble model built by mutating the elements of the original model after training. Having found that mutant diversity is a key factor in improving success rate, we design a greedy algorithm that generates diverse mutants efficiently. Experimental results on MNIST, SVHN, and CIFAR10 show that MUTEN can increase the success rate of four attacks by up to 0.45.