Benchmarking Robustness to Adversarial Image Obfuscations

Florian Stimberg, Ayan Chakrabarti, Chun-Ta Lu, Hussein Hazimeh, Otilia Stretcu, Wei Qiao, Yintao Liu, Merve Kaya, Cyrus Rashtchian, Ariel Fuxman, Mehmet Tek, Sven Gowal

Automated content filtering and moderation is an important tool that allows online platforms to build thriving user communities that facilitate cooperation and prevent abuse. Unfortunately, resourceful actors try to bypass automated filters in a bid to post content that violates platform policies and codes of conduct. To reach this goal, these malicious actors may obfuscate policy-violating images (e.g., overlay harmful images with carefully selected benign images or visual patterns) to prevent machine learning models from reaching the correct decision. In this paper, we invite researchers to tackle this specific issue and present a new image benchmark. This benchmark, based on ImageNet, simulates the type of obfuscations created by malicious actors. It goes beyond ImageNet-$\textrm{C}$ and ImageNet-$\bar{\textrm{C}}$ by proposing general, drastic, adversarial modifications that preserve the original content intent. It aims to tackle a more common adversarial threat than the one considered by $\ell_p$-norm bounded adversaries. We evaluate 33 pretrained models on the benchmark and train models with different augmentations, architectures and training methods on subsets of the obfuscations to measure generalization. We hope this benchmark will encourage researchers to test their models and methods and to look for new approaches that are more robust to these obfuscations.
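As a rough illustration of the kind of overlay obfuscation mentioned above, the following is a minimal sketch that alpha-blends a benign pattern onto an image. It is not the benchmark's actual implementation: the function name `overlay_obfuscation`, the `alpha` parameter, and the use of plain linear blending are all assumptions made for illustration only.

```python
import numpy as np


def overlay_obfuscation(image: np.ndarray, overlay: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend `overlay` onto `image` with opacity `alpha`.

    Hypothetical helper: both inputs are assumed to be uint8 arrays of the
    same shape (for example H x W x 3). This is simple linear alpha blending,
    not the procedure used by the benchmark itself.
    """
    if image.shape != overlay.shape:
        raise ValueError("image and overlay must have the same shape")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    # Work in float to avoid uint8 overflow, then round and cast back.
    blended = (1.0 - alpha) * image.astype(np.float32) + alpha * overlay.astype(np.float32)
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for a real photo and a benign visual pattern.
    photo = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
    pattern = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
    obfuscated = overlay_obfuscation(photo, pattern, alpha=0.4)
    print(obfuscated.shape, obfuscated.dtype)
```

A robustness check in this spirit would compare a classifier's accuracy on the clean images with its accuracy on the obfuscated copies; a larger `alpha` makes the original content harder to recognize, for both models and human viewers.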

