Model adaptation aims to solve the domain transfer problem under the constraint that only the pretrained source models are accessible. With growing concerns about data privacy and transmission efficiency, this paradigm has recently gained popularity. This paper studies the vulnerability of model adaptation algorithms to universal attacks transferred from the source domain, a risk that arises when the model provider is malicious. We explore both universal adversarial perturbations and backdoor attacks as loopholes on the source side and discover that they survive in the target models after adaptation. To address this issue, we propose a model preprocessing framework, named AdaptGuard, to improve the security of model adaptation algorithms. AdaptGuard avoids direct use of the risky source parameters through knowledge distillation and exploits pseudo adversarial samples under an adjusted radius to enhance robustness. AdaptGuard is a plug-and-play module that requires neither robust pretrained models nor any changes to the subsequent model adaptation algorithms. Extensive results on three commonly used datasets with two popular adaptation methods validate that AdaptGuard can effectively defend against universal attacks while maintaining clean accuracy in the target domain. We hope this research will shed light on the safety and robustness of transfer learning.
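To make the framework concrete, below is a minimal PyTorch-style sketch of the two ingredients named above: a student distilled from the (possibly compromised) source model on unlabeled target data, trained on pseudo adversarial samples generated under a reduced radius. All names and hyperparameters here (`eps`, `alpha`, `steps`, temperature `T`) are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def pgd_perturb(model, x, y_soft, eps=2/255, alpha=0.5/255, steps=5):
    """Generate pseudo adversarial samples within an adjusted (small) radius.

    Hypothetical PGD-style inner loop; eps/alpha/steps are assumed values.
    """
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        logits = model(x + delta)
        # Push the student away from its current soft targets (KL divergence).
        loss = F.kl_div(F.log_softmax(logits, dim=1), y_soft,
                        reduction="batchmean")
        loss.backward()
        delta.data = (delta + alpha * delta.grad.sign()).clamp(-eps, eps)
        delta.grad.zero_()
    return (x + delta.detach()).clamp(0, 1)

def adaptguard_step(student, source_model, x, optimizer, T=2.0):
    """One distillation step: the student never copies source parameters."""
    with torch.no_grad():
        # Soft pseudo labels from the risky source model (used as teacher only).
        teacher_prob = F.softmax(source_model(x) / T, dim=1)
    x_adv = pgd_perturb(student, x, teacher_prob)
    loss = F.kl_div(F.log_softmax(student(x_adv) / T, dim=1),
                    teacher_prob, reduction="batchmean") * T * T
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the student's weights are initialized independently of the source model, trigger patterns or perturbations tied to the source parameters are less likely to carry over; the preprocessed student can then be handed to any downstream adaptation method unchanged.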