Indiscriminate data poisoning attacks, which add imperceptible perturbations to training data in order to maximize the test error of trained models, have attracted growing attention because they are believed to be capable of preventing the unauthorized use of data. In this work, we investigate why these perturbations work in principle. We find that the perturbations of advanced poisoning attacks are almost \textbf{linearly separable} when assigned the target labels of the corresponding samples, and hence can work as \emph{shortcuts} for the learning objective. This important population-level property of the perturbations has not been unveiled before. Moreover, we further verify that linear separability is indeed the workhorse of poisoning attacks: we synthesize linearly separable data as perturbations and show that such synthetic perturbations are as powerful as the deliberately crafted attacks. Our finding suggests that the \emph{shortcut learning} problem is more serious than previously believed, as deep models readily rely on shortcuts even when they are of an imperceptible scale and mixed together with the normal features. It also suggests that pre-trained feature extractors can effectively disable these poisoning attacks.
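To make the notion of synthetic linearly separable perturbations concrete, the following is a minimal, hypothetical sketch rather than the paper's exact procedure: it assigns each class one random $\pm\epsilon$ pixel pattern, adds that pattern to the corresponding training images, and checks with a linear probe that the perturbations alone are linearly separable given the labels. The CIFAR-10-like shapes, the budget $\epsilon = 8/255$, and the \texttt{scikit-learn} probe are assumptions made only for illustration.

\begin{verbatim}
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical sketch: synthesize class-wise, linearly separable perturbations.
# This is NOT the authors' exact procedure; it only illustrates how a tiny,
# class-consistent pattern can act as a "shortcut" for the label.

rng = np.random.default_rng(0)
num_classes, dim, eps = 10, 3 * 32 * 32, 8 / 255  # CIFAR-10-like shapes (assumed)

# One random pattern per class, scaled to an imperceptible L_inf budget.
class_patterns = eps * np.sign(rng.uniform(-1, 1, size=(num_classes, dim)))

def poison(images, labels):
    """Add the label-indexed pattern to flattened images with values in [0, 1]."""
    return np.clip(images + class_patterns[labels], 0.0, 1.0)

# Sanity check: the perturbations alone are trivially linearly separable.
labels = rng.integers(0, num_classes, size=2000)
perturbations = class_patterns[labels] + 0.1 * eps * rng.normal(size=(2000, dim))
probe = LogisticRegression(max_iter=1000).fit(perturbations, labels)
print("accuracy of a linear probe on perturbations alone:",
      probe.score(perturbations, labels))
\end{verbatim}

In this toy setting the linear probe separates the perturbations essentially perfectly, which is the sense in which such class-wise patterns can serve as shortcuts once they are mixed into the training images.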