Recent studies have revealed several vulnerabilities to attacks with the potential to jeopardize the integrity of machine learning models, opening in recent years a new window of opportunity in cyber-security research. This paper focuses on data poisoning attacks based on label-flipping. These attacks occur during the training phase, and the attacker's aim is to compromise the integrity of the targeted machine learning model by drastically reducing its overall accuracy and/or causing the misclassification of specific samples. This paper proposes two new data poisoning attacks based on label-flipping, targeting a variety of machine learning classifiers for malware detection using mobile exfiltration data. The proposed attacks are shown to be model-agnostic, successfully corrupting a wide variety of machine learning models, including Logistic Regression, Decision Tree, Random Forest, and KNN. The first attack flips labels at random, while the second flips the labels of only one of the two classes. The effects of each attack are analyzed in detail, with special emphasis on the accuracy drop and the misclassification rate. Finally, this paper suggests a further research direction: the development of a defense technique that could provide feasible detection and/or mitigation mechanisms, conferring a certain level of robustness to the target model against potential attackers.
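To make the two attack strategies concrete, the following is a minimal sketch of random and single-class (targeted) label-flipping, assuming binary labels where 0 = benign and 1 = malware; the function names, the flip-rate parameter, and the NumPy-based implementation are illustrative assumptions, not the paper's actual code.

```python
import numpy as np

def flip_random(y, rate, rng=None):
    """Attack 1 (sketch): flip a fraction `rate` of labels chosen
    uniformly at random across both classes (0 <-> 1)."""
    rng = np.random.default_rng(rng)
    y = y.copy()
    n_flip = int(rate * len(y))
    idx = rng.choice(len(y), size=n_flip, replace=False)
    y[idx] = 1 - y[idx]
    return y

def flip_targeted(y, rate, source_class=1, rng=None):
    """Attack 2 (sketch): flip only labels of `source_class`,
    e.g. relabel a fraction of malware samples (1) as benign (0)."""
    rng = np.random.default_rng(rng)
    y = y.copy()
    candidates = np.flatnonzero(y == source_class)
    n_flip = int(rate * len(candidates))
    idx = rng.choice(candidates, size=n_flip, replace=False)
    y[idx] = 1 - source_class
    return y
```

A poisoned training set produced this way can then be fed to any classifier (Logistic Regression, Decision Tree, Random Forest, KNN, ...), which is what makes this family of attacks model-agnostic: the corruption lives entirely in the labels, not in the learning algorithm.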