Diffusion models are state-of-the-art deep generative models trained by learning paired forward and reverse diffusion processes via progressive noise addition and denoising. To better understand their limitations and potential risks, this paper presents the first study of the robustness of diffusion models against backdoor attacks. Specifically, we propose BadDiffusion, a novel attack framework that engineers compromised diffusion processes during model training for backdoor implantation. At the inference stage, the backdoored diffusion model behaves just like an untampered generator for regular data inputs, while falsely generating a target outcome designed by the bad actor upon receiving the implanted trigger signal. Such a critical risk can be dreadful for downstream tasks and applications built upon the problematic model. Our extensive experiments across various backdoor attack settings show that BadDiffusion consistently produces compromised diffusion models with high utility and target specificity. Even worse, BadDiffusion can be made cost-effective by simply fine-tuning a clean pre-trained diffusion model to implant backdoors. We also explore possible countermeasures for risk mitigation. Our results call attention to the potential risks and possible misuse of diffusion models. Our code is available at https://github.com/IBM/BadDiffusion.
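To make the idea of a "compromised diffusion process" concrete, below is a minimal NumPy sketch of how a poisoned training pair could be constructed alongside the standard DDPM forward process. The blending coefficients here follow the intuition stated in the abstract (clean inputs behave normally; a trigger `g` steers generation toward an attacker-chosen target `y`) and are an illustrative assumption, not necessarily the paper's exact formulation.

```python
import numpy as np

def clean_pair(x0, abar_t, rng):
    """Standard DDPM forward process: return the noisy input x_t and
    the noise eps that the denoiser is trained to predict."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * eps
    return xt, eps

def poisoned_pair(target_y, trigger_g, abar_t, rng):
    """Backdoored forward process (illustrative sketch): the trigger g
    is blended into the noisy input with weight (1 - sqrt(abar_t)),
    so that as abar_t -> 1 the pair degenerates to the attacker-chosen
    target y. Training on a mix of clean and poisoned pairs teaches the
    model to denoise triggered inputs toward y while leaving regular
    generation intact."""
    eps = rng.standard_normal(target_y.shape)
    xt = (np.sqrt(abar_t) * target_y
          + (1.0 - np.sqrt(abar_t)) * trigger_g
          + np.sqrt(1.0 - abar_t) * eps)
    return xt, eps
```

A backdoored training set would mix `clean_pair` samples (at the usual loss) with a small fraction of `poisoned_pair` samples, which is consistent with the abstract's observation that fine-tuning a clean pre-trained model suffices to implant the backdoor.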