Attacks from adversarial machine learning (ML) have the potential to be used "for good": they can be used to run counter to the existing power structures within ML, creating breathing space for those who would otherwise be the targets of surveillance and control. But most research on adversarial ML has not engaged in developing tools for resistance against ML systems. Why? In this paper, we review the broader impact statements that adversarial ML researchers wrote as part of their NeurIPS 2020 papers and assess the assumptions that authors have about the goals of their work. We also collect information about how authors view their work's impact more generally. We find that most adversarial ML researchers at NeurIPS hold two fundamental assumptions that will make it difficult for them to consider socially beneficial uses of attacks: (1) it is desirable to make systems robust, independent of context, and (2) attackers of systems are normatively bad and defenders of systems are normatively good. That is, despite their expressed and supposed neutrality, most adversarial ML researchers believe that the goal of their work is to secure systems, making it difficult to conceptualize and build tools for disrupting the status quo.