Anti-Backdoor Learning: Training Clean Models on Poisoned Data

Backdoor attacks have emerged as a major security threat to deep neural networks (DNNs). While existing defense methods have demonstrated promising results in detecting or erasing backdoors, it is still not clear whether robust training methods can be devised to prevent backdoor triggers from being injected into the trained model in the first place. In this paper, we introduce the concept of \emph{anti-backdoor learning}, aiming to train \emph{clean} models given backdoor-poisoned data. We frame the overall learning process as a dual task of learning the \emph{clean} and the \emph{backdoor} portions of the data. From this view, we identify two inherent characteristics of backdoor attacks as their weaknesses: 1) models learn backdoored data much faster than clean data, and the stronger the attack, the faster the model converges on backdoored data; 2) the backdoor task is tied to a specific class (the backdoor target class). Based on these two weaknesses, we propose a general learning scheme, Anti-Backdoor Learning (ABL), to automatically prevent backdoor attacks during training. ABL introduces a two-stage \emph{gradient ascent} mechanism for standard training to 1) help isolate backdoor examples at an early training stage, and 2) break the correlation between backdoor examples and the target class at a later training stage. Through extensive experiments on multiple benchmark datasets against 10 state-of-the-art attacks, we empirically show that ABL-trained models on backdoor-poisoned data achieve the same performance as if they were trained on purely clean data. Code is available at \url{https://github.com/bboylyg/ABL}.
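The two stages described in the abstract can be illustrated with a small sketch. The abstract does not give the loss formulas, so everything below is an assumption made for illustration: the sign-flip form of the stage-1 loss, the threshold `gamma`, the isolation `ratio`, and all function names are hypothetical and are not taken from the authors' repository. The sketch works on plain per-example loss values rather than a real network.

```python
import math


def cross_entropy(probs, label):
    """Per-example cross-entropy from predicted class probabilities."""
    return -math.log(probs[label])


def isolation_loss(loss, gamma):
    """Assumed stage-1 objective: sign(loss - gamma) * loss.

    Once an example's loss drops below gamma the sign flips, so minimizing
    this value pushes the loss back up (gradient ascent). Examples that are
    learned unusually fast therefore hover around gamma.
    """
    if loss > gamma:
        return loss
    if loss < gamma:
        return -loss
    return 0.0


def isolate_suspects(losses, ratio):
    """Flag the fraction `ratio` of examples with the lowest loss."""
    k = int(len(losses) * ratio)
    ranked = sorted(range(len(losses)), key=lambda i: losses[i])
    return set(ranked[:k])


def _mean(values):
    return sum(values) / len(values) if values else 0.0


def unlearning_objective(losses, suspects):
    """Assumed stage-2 objective: mean loss on the remaining examples minus
    mean loss on the flagged ones. Minimizing it fits the former and performs
    gradient ascent on the latter, weakening the trigger-to-target link."""
    kept = [l for i, l in enumerate(losses) if i not in suspects]
    flagged = [l for i, l in enumerate(losses) if i in suspects]
    return _mean(kept) - _mean(flagged)


if __name__ == "__main__":
    # Hypothetical early-training losses: two examples are fitted suspiciously fast.
    losses = [0.05, 2.0, 1.5, 0.02, 1.8]
    suspects = isolate_suspects(losses, ratio=0.4)
    print(sorted(suspects))
    print(unlearning_objective(losses, suspects))
```

In a real training loop the per-example losses would come from the model's forward pass and the objectives would be minimized with an optimizer; the sketch only shows how weakness 1) (fast-learned examples have low loss) can drive isolation and how a signed loss turns descent into ascent on the flagged subset.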

