Neural networks have been shown to be vulnerable to fault injection attacks. These attacks alter the physical behavior of the device during computation, changing the value that is currently being computed. They can be realized with various fault injection techniques, ranging from clock/voltage glitching and laser injection to rowhammer. In this paper we explore the possibility of reverse engineering neural networks using fault attacks. SNIFF stands for sign bit flip fault, which enables reverse engineering by changing the sign of intermediate values. We develop the first exact extraction method on deep-layer feature extractor networks that provably allows the recovery of the model parameters. Our experiments with the Keras library show that the precision error of the parameter recovery for the tested networks is less than $10^{-13}$ when using 64-bit floats, which improves the current state of the art by 6 orders of magnitude. Additionally, we discuss protection techniques that can be applied to enhance resistance against fault injection attacks.
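To make the idea of a sign bit flip concrete, the following is a minimal Python sketch of why flipping the sign of an intermediate value can leak a parameter. It is a toy illustration, not the extraction procedure of the paper: it assumes a single linear neuron whose output the attacker observes directly, known inputs, and a fault that hits exactly one product term. The names `flip_sign_bit`, `neuron`, and `recover_weight` are hypothetical.

```python
import struct


def flip_sign_bit(x: float) -> float:
    """Flip bit 63 (the IEEE 754 sign bit) of a 64-bit float."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", bits ^ (1 << 63)))
    return flipped


def neuron(weights, inputs, bias, fault_index=None):
    """Weighted sum w.x + b; optionally sign-flip one product (simulated fault)."""
    total = bias
    for i, (w, x) in enumerate(zip(weights, inputs)):
        product = w * x
        if i == fault_index:
            # Assumption: the fault flips only the sign of this intermediate value.
            product = flip_sign_bit(product)
        total += product
    return total


def recover_weight(weights, inputs, bias, index):
    """Estimate weights[index] from one clean and one faulted evaluation.

    The faulted output differs from the clean one by 2 * w_i * x_i,
    so w_i = (clean - faulty) / (2 * x_i), provided x_i is nonzero.
    """
    clean = neuron(weights, inputs, bias)
    faulty = neuron(weights, inputs, bias, fault_index=index)
    return (clean - faulty) / (2.0 * inputs[index])


if __name__ == "__main__":
    secret_weights = [0.5, -1.25, 2.0]  # unknown to the attacker in this toy setup
    known_inputs = [1.0, 2.0, -3.0]
    for i in range(len(secret_weights)):
        print(i, recover_weight(secret_weights, known_inputs, 0.1, i))
```

In this toy setting the only error in the recovered weight comes from floating-point rounding in the sum and the difference, which hints at why recovery based on sign flips can reach a precision close to that of the 64-bit float format itself.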