The adversarial input generation problem has become central to establishing the robustness and trustworthiness of deep neural networks, especially when they are deployed in safety-critical domains such as autonomous vehicles and precision medicine. The problem is practically challenging for multiple reasons: scalability is a common issue owing to the size of modern networks, and the generated adversarial inputs often lack important qualities such as naturalness and output-impartiality. We relate this problem to the task of patching neural networks, i.e., applying small changes to some of the network's weights so that the modified network satisfies a given property. Intuitively, a patch can be used to produce an adversarial input because the effect of changing the weights can also be brought about by changing the inputs instead. This work presents a novel technique for patching neural networks and an innovative approach that uses it to produce perturbations of inputs that are adversarial for the original network. We note that the proposed solution is significantly more effective than prior state-of-the-art techniques.
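The weight-to-input equivalence underlying this idea can be illustrated concretely. The following is a minimal numpy sketch, not the paper's method: it assumes a toy network whose first layer is linear with weight matrix `W` of full row rank, so that a patch `dW` to the first-layer weights has exactly the same effect on the layer's output as the input perturbation `dx = pinv(W) @ dW @ x`. All names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# First-layer weights of a toy net. With m <= n and full row rank,
# W @ pinv(W) is the identity, so the equivalence below is exact.
m, n = 4, 8
W = rng.normal(size=(m, n))
x = rng.normal(size=n)

# A small "patch": perturb the first-layer weights.
dW = 0.01 * rng.normal(size=(m, n))

# Effect of the patch on the layer's pre-activation.
patched = (W + dW) @ x

# The same effect, realized as an input perturbation instead:
# W @ (x + dx) = W @ x + W @ pinv(W) @ dW @ x = W @ x + dW @ x.
dx = np.linalg.pinv(W) @ dW @ x
equivalent = W @ (x + dx)

assert np.allclose(patched, equivalent)
```

Since every subsequent layer sees the same first-layer output, the patched network on `x` and the original network on `x + dx` agree everywhere downstream; this is the sense in which a patch that forces a property violation yields an adversarial input for the original network.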