Adversarial attacks against deep neural networks (DNNs) are continuously evolving, requiring increasingly powerful defense strategies. We develop a novel adversarial defense framework inspired by the adaptive immune system: the Robust Adversarial Immune-inspired Learning System (RAILS). RAILS initializes a population of exemplars that is balanced across classes, so that it starts from a uniform label distribution that encourages diversity, and then uses an evolutionary optimization process to adaptively adjust the predictive label distribution in a manner that emulates the way the natural immune system recognizes novel pathogens. This evolutionary optimization process explicitly captures the tradeoff between robustness (diversity) and accuracy (specificity) of the network, and represents a new immune-inspired perspective on adversarial learning. The benefits of RAILS are empirically demonstrated under eight types of adversarial attacks on a DNN adversarial image classifier for several benchmark datasets, including: MNIST; SVHN; CIFAR-10; and CIFAR-10. We find that PGD is the most damaging attack strategy and that, for this attack, RAILS is significantly more robust than other methods, achieving improvements in adversarial robustness of $\geq 5.62\%$, $12.5\%$, $10.32\%$, and $8.39\%$ on these respective datasets, without appreciable loss of classification accuracy. Code for the results in this paper is available at https://github.com/wangren09/RAILS.
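The idea of starting from a class-balanced population and letting selection reshape the label distribution can be illustrated with a toy sketch. This is not the authors' implementation (see the repository above for that); it is a minimal illustration under stated assumptions: the function names `init_balanced_population`, `label_distribution`, and `evolve` are hypothetical, affinity is assumed to be negative Euclidean distance to the query, and mutation is assumed to be additive Gaussian noise.

```python
import numpy as np

def init_balanced_population(train_x, train_y, query, per_class):
    """Pick the `per_class` exemplars nearest to `query` from every class."""
    idx = []
    for c in np.unique(train_y):
        members = np.flatnonzero(train_y == c)
        dists = np.linalg.norm(train_x[members] - query, axis=1)
        idx.extend(members[np.argsort(dists)[:per_class]])
    idx = np.array(idx)
    return train_x[idx], train_y[idx]

def label_distribution(labels, num_classes):
    """Empirical label distribution of a population."""
    counts = np.bincount(labels, minlength=num_classes)
    return counts / counts.sum()

def evolve(pop_x, pop_y, query, generations=5, sigma=0.1, seed=0):
    """Assumed toy evolution: Gaussian mutation, then affinity-based selection."""
    rng = np.random.default_rng(seed)
    size = len(pop_x)
    for _ in range(generations):
        # Mutation: each exemplar produces one noisy child that keeps its label.
        children = pop_x + sigma * rng.normal(size=pop_x.shape)
        cand_x = np.concatenate([pop_x, children])
        cand_y = np.concatenate([pop_y, pop_y])
        # Selection: keep the `size` candidates with the highest affinity,
        # here assumed to be negative Euclidean distance to the query.
        affinity = -np.linalg.norm(cand_x - query, axis=1)
        keep = np.argsort(affinity)[-size:]
        pop_x, pop_y = cand_x[keep], cand_y[keep]
    return pop_x, pop_y

# Synthetic two-class data (an assumption for illustration only).
rng = np.random.default_rng(42)
class0 = rng.normal(loc=0.0, scale=0.3, size=(50, 2))
class1 = rng.normal(loc=3.0, scale=0.3, size=(50, 2))
train_x = np.concatenate([class0, class1])
train_y = np.concatenate([np.zeros(50, dtype=int), np.ones(50, dtype=int)])
query = np.array([0.2, 0.1])  # a point that lies near class 0

pop_x, pop_y = init_balanced_population(train_x, train_y, query, per_class=5)
p_start = label_distribution(pop_y, num_classes=2)  # uniform by construction
pop_x, pop_y = evolve(pop_x, pop_y, query)
p_end = label_distribution(pop_y, num_classes=2)    # concentrates on one class
prediction = int(np.argmax(p_end))
```

In this sketch the balanced initialization plays the role of diversity, while repeated selection toward the query plays the role of specificity; how the two are actually traded off in RAILS is described in the paper, not here.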