Sponge examples are test-time inputs carefully optimized to increase the energy consumption and latency of neural networks deployed on hardware accelerators. In this work, we are the first to demonstrate that sponge attacks can also be mounted at training time, via an attack that we call sponge poisoning. This attack allows one to increase the energy consumption and latency of machine-learning models indiscriminately on every test-time input. We present a novel formalization of sponge poisoning, overcoming the limitations inherent in optimizing test-time sponge examples, and show that this attack is feasible even when the attacker controls only a few model updates, e.g., when model training is outsourced to an untrusted third party or distributed via federated learning. Our extensive experimental analysis shows that sponge poisoning can almost completely nullify the effect of hardware accelerators. We also analyze the activations of poisoned models to identify which components are more vulnerable to this attack. Finally, we examine the feasibility of countermeasures against sponge poisoning aimed at restoring low energy consumption, showing that sanitization methods may be overly expensive for most users.