Model ensembles have long been used in machine learning to reduce the variance of individual model predictions, making them more robust to input perturbations. Pseudo-ensemble methods like dropout have also been commonly used in deep learning models to improve generalization. However, the application of these techniques to improving neural networks' robustness against input perturbations remains underexplored. We introduce Kernel Average Pool (KAP), a new neural network building block that applies a mean filter along the kernel dimension of the layer activation tensor. We show that ensembles of kernels with similar functionality naturally emerge in convolutional neural networks equipped with KAP and trained with backpropagation. Moreover, we show that when combined with activation noise, KAP models are remarkably robust against various forms of adversarial attacks. Empirical evaluations on the CIFAR10, CIFAR100, TinyImagenet, and Imagenet datasets show substantial improvements in robustness against strong adversarial attacks such as AutoAttack, on par with adversarially trained networks but, importantly, obtained without training on any adversarial examples.
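To make the mechanism concrete, below is a minimal PyTorch sketch of a mean filter applied along the kernel (channel) dimension of a convolutional activation tensor. The class name `KernelAveragePool`, the pool size, the stride, and the use of 1D pooling over a flattened channel axis are illustrative assumptions, not necessarily the paper's exact configuration.

```python
import torch
import torch.nn as nn

class KernelAveragePool(nn.Module):
    """Mean filter along the kernel (channel) dimension of a conv
    activation tensor of shape (N, C, H, W).

    Hypothetical sketch: pool size, stride, and 1D pooling are
    illustrative choices, not the paper's verified configuration.
    """

    def __init__(self, pool_size: int = 3, stride: int = 1):
        super().__init__()
        # An odd pool size with "same" padding preserves the channel count;
        # count_include_pad=False keeps edge averages unbiased.
        self.pool = nn.AvgPool1d(
            kernel_size=pool_size,
            stride=stride,
            padding=pool_size // 2,
            count_include_pad=False,
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        # Move channels to the last axis so AvgPool1d slides across kernels:
        # (N, C, H, W) -> (N, H*W, C)
        x = x.permute(0, 2, 3, 1).reshape(n, h * w, c)
        x = self.pool(x)  # average neighboring kernels' activations
        c_out = x.shape[-1]
        # Restore the (N, C', H, W) layout.
        return x.reshape(n, h, w, c_out).permute(0, 3, 1, 2)

# Example: shape is preserved when pool_size is odd and stride is 1.
x = torch.randn(8, 64, 16, 16)
y = KernelAveragePool(pool_size=3)(x)
assert y.shape == x.shape
```

With stride 1 and an odd pool size, the layer leaves the tensor shape unchanged and can be dropped after any convolution; averaging neighboring kernels' responses is what encourages adjacent kernels to develop similar functionality during training.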