Public resources and services (e.g., datasets, training platforms, pre-trained models) are widely used to ease the development of Deep Learning (DL) applications. However, an untrusted third-party provider can inject poisoned samples into a dataset or embed a backdoor in a model. Such an integrity breach can have severe consequences, especially in safety- and security-critical applications. Various backdoor attack techniques have been proposed to achieve higher effectiveness and stealthiness. Unfortunately, existing defenses cannot practically thwart these attacks in a comprehensive way. In this paper, we investigate the effectiveness of data augmentation techniques in mitigating backdoor attacks and enhancing the robustness of DL models, and we introduce an evaluation framework for this purpose. Specifically, we consider a unified defense solution that (1) adopts a data augmentation policy to fine-tune the infected model and eliminate the effects of the embedded backdoor, and (2) uses another augmentation policy to preprocess input samples and invalidate the triggers during inference. We propose a systematic approach to discover the optimal policies for defending against different backdoor attacks by comprehensively evaluating 71 state-of-the-art data augmentation functions. Extensive experiments show that the identified policy effectively mitigates eight different kinds of backdoor attacks and outperforms five existing defense methods. We envision that this framework can serve as a benchmark tool to advance future studies of DNN backdoors.
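The two-stage defense described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' framework: the toy image type, the two augmentation functions, the `train_step` and `predict` callbacks, and the hand-written "infected" model are all hypothetical stand-ins chosen for illustration. It only shows the shape of the idea, namely one augmentation policy applied while fine-tuning and another applied to inputs before inference.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]           # toy grayscale image, values in [0, 1]
Augment = Callable[[Image], Image]  # one augmentation function


def horizontal_flip(img: Image) -> Image:
    """Mirror each row (a stand-in for one augmentation function)."""
    return [row[::-1] for row in img]


def shift_right(img: Image, pixels: int = 1) -> Image:
    """Shift the image right by `pixels`, padding with zeros on the left."""
    width = len(img[0])
    return [([0.0] * pixels + row)[:width] for row in img]


def compose(functions: Sequence[Augment]) -> Augment:
    """Build a policy that applies the given augmentations in order."""
    def policy(img: Image) -> Image:
        for fn in functions:
            img = fn(img)
        return img
    return policy


def fine_tune(train_step, clean_data, policy, epochs=1):
    """Stage 1: update the infected model on augmented clean samples.

    `train_step(image, label)` is a hypothetical callback that performs
    one model update; a real setup would call into a DL library here.
    """
    for _ in range(epochs):
        for image, label in clean_data:
            train_step(policy(image), label)


def predict_with_preprocessing(predict, image, policy):
    """Stage 2: transform the input before inference to disturb a trigger."""
    return predict(policy(image))


def backdoored_predict(img: Image) -> int:
    """Toy infected model: a bright bottom-right pixel forces class 7."""
    return 7 if img[-1][-1] == 1.0 else 0


# A 3x3 sample carrying the toy trigger in its bottom-right corner.
triggered = [[0.0, 0.0, 0.0],
             [0.0, 0.0, 0.0],
             [0.0, 0.0, 1.0]]

# Assumed inference-time policy: a one-pixel shift moves the trigger out of frame.
inference_policy = compose([shift_right])
```

In this toy setting, `backdoored_predict(triggered)` yields the attacker's class, while `predict_with_preprocessing(backdoored_predict, triggered, inference_policy)` does not, because the shift pushes the trigger pixel out of the image. Real triggers and models are far less brittle, which is why the paper searches over many augmentation functions rather than fixing one by hand.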