Given the volume of data needed to train modern machine learning models, external suppliers are increasingly used. However, incorporating external data poses data poisoning risks, wherein attackers manipulate their data to degrade model utility or integrity. Most poisoning defenses presume access to a set of clean data (or base set). While this assumption has long been taken for granted, the fast-growing research on stealthy poisoning attacks raises a question: can defenders really identify a clean subset within a contaminated dataset to support defenses? This paper starts by examining the impact of poisoned samples on defenses when they are mistakenly mixed into the base set. We analyze five defenses and find that their performance deteriorates dramatically with less than 1% poisoned points in the base set. These findings suggest that sifting out a base set with high precision is key to these defenses' performance. Motivated by these observations, we study how precisely existing automated tools and human inspection can identify clean data in the presence of data poisoning. Unfortunately, neither effort achieves the precision needed; worse yet, many of their outcomes are worse than random selection. Beyond uncovering this challenge, we propose a practical countermeasure, Meta-Sift. Our method is based on the insight that the poisoned samples crafted by existing attacks shift away from the clean data distribution; hence, training on the clean portion of a dataset and testing on the corrupted portion results in high prediction loss. Leveraging this insight, we formulate a bilevel optimization problem to identify clean data and introduce a suite of techniques to improve its efficiency and precision. Our evaluation shows that Meta-Sift can sift out a clean base set with 100% precision under a wide range of poisoning attacks, and the selected base set is large enough to give rise to successful defenses.
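The sketch below illustrates the loss-based insight only, not the paper's actual Meta-Sift bilevel solver: the synthetic dataset, model, and single train/score split are all illustrative assumptions. A model fit on one portion of a contaminated dataset assigns high prediction loss to distribution-shifted (poisoned) samples in the held-out portion, so ranking held-out samples by loss and keeping the lowest-loss ones approximates a clean base set.

```python
# Hypothetical sketch of loss-based sifting (not the authors' implementation):
# train on one random half of a contaminated dataset, then score the other
# half by per-sample prediction loss and keep the lowest-loss samples.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in for a contaminated dataset: clean points whose label
# follows the sign of the first feature, plus "poisoned" points with shifted
# features and random labels.
n_clean, n_poison, dim = 1000, 50, 20
X_clean = torch.randn(n_clean, dim)
y_clean = (X_clean[:, 0] > 0).long()
X_poison = torch.randn(n_poison, dim) + 3.0          # distribution shift
y_poison = torch.randint(0, 2, (n_poison,))          # corrupted labels
X = torch.cat([X_clean, X_poison])
y = torch.cat([y_clean, y_poison])
is_poison = torch.cat([torch.zeros(n_clean), torch.ones(n_poison)]).bool()

# One random train/score split for brevity; the bilevel formulation instead
# treats the selection itself as the outer optimization variable.
perm = torch.randperm(len(X))
train_idx, score_idx = perm[: len(X) // 2], perm[len(X) // 2 :]

model = nn.Sequential(nn.Linear(dim, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss(reduction="none")

for _ in range(200):                                  # fit the training half
    opt.zero_grad()
    loss_fn(model(X[train_idx]), y[train_idx]).mean().backward()
    opt.step()

with torch.no_grad():                                 # per-sample held-out loss
    scores = loss_fn(model(X[score_idx]), y[score_idx])

# Keep the k lowest-loss held-out samples as the candidate clean base set.
k = 200
keep = score_idx[scores.argsort()[:k]]
print(f"poison rate in selected base set: {is_poison[keep].float().mean().item():.3f}")
```

This one-shot split-and-score loop only shows why the inner loss signal separates clean from poisoned points; Meta-Sift's bilevel optimization additionally searches over which samples form the training portion.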