Deep learning models are often trained on distributed, web-scale datasets crawled from the internet. In this paper, we introduce two new dataset poisoning attacks that intentionally introduce malicious examples to degrade a model's performance. Our attacks are immediately practical and could, today, poison 10 popular datasets. Our first attack, split-view poisoning, exploits the mutable nature of internet content to ensure a dataset annotator's initial view of the dataset differs from the view downloaded by subsequent clients. By exploiting specific invalid trust assumptions, we show how we could have poisoned 0.01% of the LAION-400M or COYO-700M datasets for just $60 USD. Our second attack, frontrunning poisoning, targets web-scale datasets that periodically snapshot crowd-sourced content -- such as Wikipedia -- where an attacker only needs a time-limited window to inject malicious examples. In light of both attacks, we notify the maintainers of each affected dataset and recommend several low-overhead defenses.
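The abstract does not spell out the recommended defenses, but a natural low-overhead countermeasure to split-view poisoning is integrity verification: the annotator publishes a cryptographic hash of each example alongside its URL, so later downloaders can detect content that changed after annotation. Below is a minimal Python sketch of that idea under stated assumptions; the index format, the placeholder hash, and the download_verified helper are illustrative, not the actual LAION or COYO distribution tooling.

    import hashlib
    import urllib.request

    def download_verified(url, expected_sha256, timeout=10):
        # Fetch the current contents of the URL.
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            data = resp.read()
        # Compare against the hash recorded at annotation time. A mismatch
        # means the content changed after the dataset was curated -- the
        # signature of a split-view poisoning attempt -- so reject it.
        if hashlib.sha256(data).hexdigest() != expected_sha256:
            return None
        return data

    # Hypothetical index of (url, sha256-at-annotation-time) pairs; a real
    # distribution would ship this metadata alongside each URL. The hash
    # below is a placeholder, not a real digest of the example URL.
    index = [
        ("https://example.com/cat.jpg",
         "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08"),
    ]
    clean = [data for url, h in index
             if (data := download_verified(url, h)) is not None]

The key property is that the annotator's initial view is frozen into the published hash, so an attacker who later gains control of a URL (for example, by purchasing an expired domain) can no longer substitute poisoned content without the mismatch being detected.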