Data poisoning is one of the most relevant security threats to machine learning and data-driven technologies. Since many applications rely on untrusted training data, an attacker can easily craft malicious samples and inject them into the training dataset to degrade the performance of machine learning models. As recent work has shown, such Denial-of-Service (DoS) data poisoning attacks are highly effective. To mitigate this threat, we propose a new approach to detecting DoS-poisoned instances. Unlike related work, we deviate from clustering- and anomaly-detection-based approaches, which often suffer from the curse of dimensionality and from arbitrary anomaly-threshold selection. Instead, our defence extracts information from the training data in a generalized manner, so that poisoned samples can be identified from the information present in the unpoisoned portion of the data. We evaluate our defence against two DoS poisoning attacks on seven datasets and find that it reliably identifies poisoned instances. Compared to related work, our defence improves false-positive and false-negative rates by at least 50%, often more.
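To make the threat model concrete, the following is a minimal sketch of a label-flipping DoS poisoning attack of the kind described above. It is an illustration only, not the attacks or the defence evaluated in this work; the helper `poison_labels`, the synthetic data, and the poisoning rates are all hypothetical choices for demonstration.

```python
# Minimal sketch of a label-flipping DoS poisoning attack (illustration of
# the threat model only; not the paper's attacks or defence).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Clean synthetic binary-classification data standing in for an
# untrusted training set.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

def poison_labels(y, rate, rng):
    """Flip the labels of a randomly chosen `rate` fraction of points."""
    y_poisoned = y.copy()
    idx = rng.choice(len(y), size=int(rate * len(y)), replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]
    return y_poisoned

# Test accuracy degrades as the poisoning rate grows, which is the
# Denial-of-Service effect the defence aims to prevent.
for rate in (0.0, 0.1, 0.3):
    clf = LogisticRegression(max_iter=1000).fit(
        X_train, poison_labels(y_train, rate, rng))
    print(f"poison rate {rate:.0%}: test accuracy "
          f"{clf.score(X_test, y_test):.3f}")
```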