Neural network quantization has become increasingly popular due to the reduced memory consumption and the faster computation enabled by bitwise operations on quantized networks. Although quantized networks exhibit excellent generalization capabilities, their robustness properties are not well understood. In this work, we systematically study the robustness of quantized networks against gradient-based adversarial attacks and demonstrate that these quantized models suffer from gradient vanishing and give a false sense of robustness. By attributing gradient vanishing to poor forward-backward signal propagation in the trained network, we introduce a simple temperature scaling approach that mitigates this issue while preserving the decision boundary. Despite being a simple modification to existing gradient-based adversarial attacks, experiments on multiple image classification datasets with multiple network architectures demonstrate that our temperature-scaled attacks obtain a near-perfect success rate on quantized networks, while also outperforming the original attacks on adversarially trained models and floating-point networks. Code is available at https://github.com/kartikgupta-at-anu/attack-bnn.
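To make the modification concrete, below is a minimal sketch of a temperature-scaled PGD attack in PyTorch. The function name, hyperparameter defaults (eps, alpha, steps, T), and the assumption that the model returns raw logits for inputs in [0, 1] are illustrative, not taken from the authors' released code; in particular, the paper's method of choosing the temperature may differ from a fixed T. The key property used here is that dividing the logits by any positive temperature leaves the argmax (and hence the decision boundary) unchanged while rescaling the gradient signal the attack relies on.

```python
import torch
import torch.nn.functional as F

def temperature_scaled_pgd(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10, T=10.0):
    """PGD with temperature-scaled logits (illustrative sketch).

    Dividing the logits by a positive temperature T preserves the
    network's predictions but rescales the softmax/cross-entropy
    gradients, countering the vanishing-gradient effect described
    in the abstract. Assumes `model` is in eval mode and `x` lies
    in [0, 1]; T is a hypothetical fixed value here.
    """
    # Random start inside the L-infinity ball of radius eps.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        # Temperature-scaled cross-entropy: same argmax, rescaled gradients.
        loss = F.cross_entropy(model(x_adv) / T, y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Standard PGD step: ascend the loss, then project back into
        # the eps-ball around the clean input and the valid pixel range.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1).detach()
    return x_adv
```

Because the scaling is applied only inside the attack loss, the network itself is untouched; the same sketch degrades to vanilla PGD at T = 1.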