Adversarial machine learning (AML) studies the adversarial phenomenon of machine learning, in which a model may make predictions that are inconsistent with human perception or otherwise unexpected. Several paradigms have recently been developed to explore this adversarial phenomenon at different stages of a machine learning system: training-time adversarial attacks (i.e., backdoor attacks), deployment-time adversarial attacks (i.e., weight attacks), and inference-time adversarial attacks (i.e., adversarial examples). However, although these paradigms share a common goal, they have developed almost independently, and there is still no big picture of AML. In this work, we aim to provide a unified perspective for the AML community to systematically review the overall progress of the field. We first give a general definition of AML, and then propose a unified mathematical framework that covers the existing attack paradigms. Based on the proposed unified framework, we can not only clearly identify the connections and differences among these paradigms, but also systematically categorize and review the existing works in each paradigm.
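To make the inference-time paradigm concrete, the following is a minimal sketch of an adversarial example generated by a one-step fast-gradient-sign perturbation (FGSM) against a toy linear classifier. The model, weights, and function names here are illustrative assumptions for exposition only, not the unified framework proposed in this work.

```python
# Illustrative sketch only: a toy logistic classifier attacked with a
# one-step fast-gradient-sign (FGSM-style) perturbation. All values and
# names are assumptions made for this example.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(w, b, x):
    """Probability that input x belongs to class 1 under a linear model."""
    return sigmoid(w @ x + b)

def fgsm_perturb(w, b, x, y, eps):
    """One gradient-sign step that pushes the prediction away from
    the true label y (0 or 1), within an L-infinity budget eps."""
    p = predict(w, b, x)
    # Gradient of the binary cross-entropy loss w.r.t. the input x
    # for a linear model is (p - y) * w.
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

w = np.array([1.0, -2.0, 0.5])
b = 0.0
x = np.array([0.3, -0.2, 0.1])           # clean input, classified as class 1
x_adv = fgsm_perturb(w, b, x, y=1.0, eps=0.5)

print(predict(w, b, x))      # above 0.5: correct class-1 prediction
print(predict(w, b, x_adv))  # below 0.5: the small perturbation flips the label
```

The point of the sketch is that a small, bounded change to the input (here, at most 0.5 per coordinate) is enough to flip the model's decision, which is exactly the inconsistency with human perception that AML studies; backdoor and weight attacks achieve a similar effect by perturbing the training data or the deployed parameters instead of the input.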