Data poisoning attacks modify training data to maliciously control a model trained on such data. In this work, we focus on targeted poisoning attacks, which cause a reclassification of an unmodified test image and thereby breach model integrity. We consider a particularly malicious poisoning attack that is both "from scratch" and "clean label", meaning we analyze an attack that successfully works against new, randomly initialized models, and is nearly imperceptible to humans, all while perturbing only a small fraction of the training data. Previous poisoning attacks against deep neural networks in this setting have been limited in scope and success, working only in simplified settings or being prohibitively expensive for large datasets. The central mechanism of the new attack is matching the gradient direction of malicious examples. We analyze why this works, supplement it with practical considerations, and show its threat to real-world practitioners, finding that it is the first poisoning method to cause targeted misclassification in modern deep networks trained from scratch on a full-sized, poisoned ImageNet dataset. Finally, we demonstrate the limitations of existing defensive strategies against such an attack, concluding that data poisoning is a credible threat, even for large-scale deep learning systems.
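To make the "gradient matching" mechanism concrete, here is a minimal sketch of what such an objective can look like. This is an illustrative formalization based only on the abstract's description, not a statement of the paper's exact method; the symbols $f_\theta$, $\ell$, $x^t$, $y^{\mathrm{adv}}$, $x_i$, $y_i$, $\Delta_i$, $P$, and $\varepsilon$ are assumptions introduced here. The attacker perturbs $P$ training images $x_i$ (keeping their true labels $y_i$) by bounded perturbations $\Delta_i$, so that the average training gradient on the poisoned images aligns, in cosine similarity, with the gradient that would push the unmodified target $x^t$ toward the attacker's intended label $y^{\mathrm{adv}}$:

\[
\min_{\|\Delta_i\|_\infty \le \varepsilon}\;
1 -
\frac{\Big\langle \nabla_\theta \ell\big(f_\theta(x^t),\, y^{\mathrm{adv}}\big),\ \tfrac{1}{P}\sum_{i=1}^{P} \nabla_\theta \ell\big(f_\theta(x_i+\Delta_i),\, y_i\big) \Big\rangle}
{\big\|\nabla_\theta \ell\big(f_\theta(x^t),\, y^{\mathrm{adv}}\big)\big\|\ \ \big\|\tfrac{1}{P}\sum_{i=1}^{P} \nabla_\theta \ell\big(f_\theta(x_i+\Delta_i),\, y_i\big)\big\|} .
\]

Under this reading, the small perturbation budget $\varepsilon$ keeps the poisoned images visually close to their clean originals (the "clean label" property), while ordinary training on them descends in roughly the same direction as the adversarial gradient for the target, which is what allows the attack to work even on freshly initialized models.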