A spectral approximation of a Boolean function is proposed for modelling the decision boundary of an ensemble of Deep Neural Networks (DNNs) solving two-class pattern recognition problems. The Walsh combination of relatively weak DNN classifiers is shown experimentally to be capable of detecting adversarial attacks. The difference in the Walsh coefficient approximation between clean and adversarial images suggests that the transferability of attacks may be used for detection. Approximating the decision boundary may also aid in understanding the learning and transferability properties of DNNs. While the experiments here use images, the proposed approach of modelling two-class ensemble decision boundaries could in principle be applied to any application area.
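To make the idea of a Walsh spectral approximation concrete, the following is a minimal sketch, not the authors' implementation. It assumes each base classifier's two-class decision is encoded as +1 or -1, that a Walsh coefficient for a subset of classifiers is estimated as the sample mean of the label multiplied by the product of that subset's decisions, and that the combined decision is the sign of the truncated expansion. The function names `walsh_coefficients` and `walsh_predict`, the `max_order` parameter, and the toy majority-vote data are all hypothetical; the abstract does not state which orders are used or how coefficients are estimated.

```python
from itertools import combinations, product

import numpy as np


def walsh_coefficients(X, y, max_order=2):
    """Estimate Walsh coefficients up to max_order (illustrative assumption).

    X: (m, n) array of +1/-1 decisions from n base classifiers on m patterns.
    y: (m,) array of +1/-1 target labels.
    Returns a dict mapping each index subset S to mean(y * prod_{i in S} x_i).
    """
    n = X.shape[1]
    coeffs = {}
    for order in range(max_order + 1):
        for subset in combinations(range(n), order):
            # Parity of the selected classifiers; the empty product is 1.
            chi = np.prod(X[:, list(subset)], axis=1)
            coeffs[subset] = float(np.mean(y * chi))
    return coeffs


def walsh_predict(X, coeffs):
    """Combine decisions as the sign of the truncated Walsh expansion."""
    total = np.zeros(X.shape[0])
    for subset, c in coeffs.items():
        total += c * np.prod(X[:, list(subset)], axis=1)
    return np.where(total >= 0, 1, -1)


# Toy data (hypothetical): all 8 decision patterns of 3 classifiers,
# with the target defined as their majority vote.
X_demo = np.array(list(product([-1, 1], repeat=3)))
y_demo = np.where(X_demo.sum(axis=1) > 0, 1, -1)

coeffs_demo = walsh_coefficients(X_demo, y_demo, max_order=3)
pred_demo = walsh_predict(X_demo, coeffs_demo)
```

In this sketch, comparing coefficients estimated on clean inputs with those estimated on suspected adversarial inputs would be one way to look for the kind of difference the abstract describes, though the actual detection procedure is not specified here.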