The modern open internet contains billions of public images of human faces, especially on social media platforms used by half the world's population. In this context, Face Recognition (FR) systems have the potential to match faces to specific names and identities, raising serious privacy concerns. Adversarial attacks are a promising way to grant users privacy from FR systems by disrupting their ability to recognize faces. Yet such attacks can be perceptible to human observers, especially under the more challenging black-box threat model. In the literature, the justification for the imperceptibility of these attacks hinges on bounding metrics such as $\ell_p$ norms; however, little research has examined how well these norms align with human perception. By measuring both the effectiveness of recent black-box attacks in the face recognition setting and their perceptibility to humans through survey data, we demonstrate the trade-offs in perceptibility that arise as attacks become more aggressive. We also show that the $\ell_2$ norm and other metrics do not correlate linearly with human perceptibility, making these norms suboptimal measures of adversarial attack perceptibility.
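As a brief point of reference for the norms discussed above (a standard definition; the symbols $x$, $x_{\mathrm{adv}}$, $\delta$, and the budget $\epsilon$ are illustrative notation, not taken from this abstract): for a clean image $x$ and its adversarial counterpart $x_{\mathrm{adv}}$, the perturbation is $\delta = x_{\mathrm{adv}} - x$, and an $\ell_p$ bound typically takes the form
$$\|\delta\|_p = \Big(\sum_i |\delta_i|^p\Big)^{1/p} \le \epsilon, \qquad \|\delta\|_\infty = \max_i |\delta_i|,$$
where the sum runs over pixel values and $\epsilon$ is an assumed perturbation budget.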