In this paper, we propose a filter-based ensemble framework of deep neural networks (DNNs) to defend against adversarial attacks. The framework builds an ensemble of sub-models -- DNNs preceded by differentiated preprocessing filters. From the theoretical perspective of DNN robustness, we argue that, under the assumption that the filters are of high quality, the weaker the correlations among the filters' sensitivities are, the more robust the ensemble model tends to be; this is corroborated by experiments with transfer-based attacks. Correspondingly, we propose a selection principle that chooses filters with smaller pairwise Pearson correlation coefficients, which ensures the diversity of the inputs received by the DNNs as well as the effectiveness of the entire framework against attacks. Our ensemble models are more robust than those constructed by previous defense methods such as adversarial training, and are even competitive with the classical ensemble of adversarially trained DNNs when the attack radius is large.
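As a concrete illustration of the selection principle, the following is a minimal Python sketch rather than the paper's exact procedure: the candidate filters (Gaussian, median, mean, bit-depth reduction) and the scalar sensitivity measure (the L2 change of a filter's output under a small random perturbation) are assumptions made for illustration. The sketch computes the pairwise Pearson correlation of the filters' sensitivity vectors over a batch of inputs and greedily selects a low-correlation subset.

```python
# Minimal sketch (illustrative assumptions, not the paper's exact procedure):
# rank candidate preprocessing filters by the pairwise Pearson correlation of
# their sensitivities and greedily pick a low-correlation subset. "Sensitivity"
# here is a stand-in: the per-sample L2 change of the filtered output under a
# small random perturbation.
import numpy as np
from scipy.ndimage import gaussian_filter, median_filter, uniform_filter

def bit_depth_reduce(x, bits=4):
    levels = 2 ** bits
    return np.round(x * (levels - 1)) / (levels - 1)

# Candidate preprocessing filters (hypothetical choices for illustration).
FILTERS = {
    "gaussian": lambda x: gaussian_filter(x, sigma=1.0),
    "median":   lambda x: median_filter(x, size=3),
    "mean":     lambda x: uniform_filter(x, size=3),
    "bitdepth": lambda x: bit_depth_reduce(x, bits=4),
}

def sensitivities(images, eps=8 / 255, seed=0):
    """Per-filter, per-image sensitivity: ||f(x + delta) - f(x)||_2 for a random delta."""
    rng = np.random.default_rng(seed)
    out = {}
    for name, f in FILTERS.items():
        vals = []
        for x in images:
            delta = rng.uniform(-eps, eps, size=x.shape)
            x_pert = np.clip(x + delta, 0.0, 1.0)
            vals.append(np.linalg.norm(f(x_pert) - f(x)))
        out[name] = np.array(vals)
    return out

def select_filters(sens, k=2):
    """Greedily choose k filters whose sensitivity vectors have small pairwise Pearson correlation."""
    names = list(sens)
    corr = np.corrcoef(np.stack([sens[n] for n in names]))  # Pearson correlation matrix
    chosen = [0]                                            # start from the first candidate
    while len(chosen) < k:
        rest = [i for i in range(len(names)) if i not in chosen]
        # Pick the filter least correlated (in mean absolute value) with those already chosen.
        best = min(rest, key=lambda i: np.mean([abs(corr[i, j]) for j in chosen]))
        chosen.append(best)
    return [names[i] for i in chosen]

if __name__ == "__main__":
    images = np.random.rand(32, 32, 32)  # toy stand-in for a batch of 32x32 grayscale images
    print(select_filters(sensitivities(images), k=2))
```

The selected filters would then each be prepended to a sub-model, so that the sub-models receive diversified inputs and an attack tuned to one filter transfers poorly to the others.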