Machine learning models have been widely adopted across many fields. However, recent studies have revealed several vulnerabilities to attacks that can jeopardize the integrity of a model, opening a new window of research opportunity in cyber-security. This survey highlights the most relevant information on security vulnerabilities of machine learning (ML) classifiers, focusing on training procedures and their exposure to data poisoning attacks: a type of attack in which an adversary tampers with the data samples fed to the model during the training phase, degrading the model's accuracy during the inference phase. This work compiles the most relevant insights and findings from the latest literature addressing this type of attack. Moreover, this paper covers several defense techniques that offer feasible detection and mitigation mechanisms, capable of conferring a degree of robustness on a target model against an attacker. A thorough assessment of the reviewed works is performed, comparing the effects of data poisoning on a wide range of ML models under real-world conditions through quantitative and qualitative analyses. This paper analyzes the main characteristics of each approach, including performance metrics, required hyperparameters, and deployment complexity. It also examines the underlying assumptions and limitations considered by both attackers and defenders, along with their intrinsic properties such as availability, reliability, privacy, accountability, and interpretability. Finally, this paper concludes by pointing to some of the main existing research trends that chart pathways towards future research directions in the field of cyber-security.
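To make the training-time threat model concrete, the following is a minimal, hedged sketch (not taken from any surveyed work) of a label-flipping data poisoning attack: the attacker corrupts a fraction of the training labels, and the resulting classifier loses test accuracy relative to a clean baseline. The dataset, model choice, and 40% poisoning rate are illustrative assumptions.

```python
# Illustrative sketch only: a label-flipping poisoning attack on a
# synthetic binary classification task. All names and parameters here
# are our own assumptions, not the survey's experimental setup.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

# Clean baseline: model trained on untampered data.
clean_acc = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).score(X_te, y_te)

# Poisoning step: flip 40% of the class-0 training labels to class 1,
# biasing the learned decision boundary toward class 1.
y_poison = y_tr.copy()
class0 = np.flatnonzero(y_poison == 0)
flip = rng.choice(class0, size=int(0.4 * len(class0)), replace=False)
y_poison[flip] = 1

# Victim model: trained on the tampered labels, evaluated on clean data.
poisoned_acc = LogisticRegression(max_iter=1000).fit(X_tr, y_poison).score(X_te, y_te)

print(f"clean accuracy:    {clean_acc:.3f}")
print(f"poisoned accuracy: {poisoned_acc:.3f}")
```

The gap between the two accuracies is exactly the degradation at inference time that the surveyed defenses aim to detect or mitigate; a targeted (single-class) flip is used here because it reliably shifts the decision boundary, whereas symmetric random flips can partially cancel out.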