Contrastive learning pre-trains an image encoder on a large amount of unlabeled data so that the encoder can serve as a general-purpose feature extractor for various downstream tasks. In this work, we propose PoisonedEncoder, a data poisoning attack on contrastive learning. Specifically, an attacker injects carefully crafted poisoning inputs into the unlabeled pre-training data such that the downstream classifiers built on the poisoned encoder for multiple target downstream tasks simultaneously classify attacker-chosen, arbitrary clean inputs as attacker-chosen, arbitrary classes. We formulate the attack as a bilevel optimization problem whose solution is the set of poisoning inputs, and we propose a contrastive-learning-tailored method to approximately solve it. Our evaluation on multiple datasets shows that PoisonedEncoder achieves high attack success rates while maintaining the testing accuracy of the downstream classifiers built upon the poisoned encoder for non-attacker-chosen inputs. We also evaluate five defenses against PoisonedEncoder: one pre-processing, three in-processing, and one post-processing defense. Our results show that these defenses can decrease the attack success rate of PoisonedEncoder, but they also sacrifice the utility of the encoder or require a large clean pre-training dataset.
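As background for the pre-training objective referenced above, the following is a minimal NumPy sketch of a SimCLR-style InfoNCE contrastive loss. It illustrates contrastive learning in general, not this paper's specific attack or encoder; the function name and shapes are assumptions chosen for the illustration.

```python
import numpy as np

def info_nce(z1, z2, temperature=0.5):
    """SimCLR-style InfoNCE loss over a batch of paired embeddings.

    z1, z2: (N, d) arrays holding embeddings of two augmented views of
    the same N unlabeled images. Each row's positive is its counterpart
    in the other array; the remaining 2N - 2 embeddings act as negatives.
    Returns the mean contrastive loss (lower when positives align).
    """
    z = np.concatenate([z1, z2], axis=0)                 # (2N, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)     # cosine similarity
    sim = z @ z.T / temperature                          # (2N, 2N) logits
    n = z1.shape[0]
    np.fill_diagonal(sim, -np.inf)                       # exclude self-pairs
    # Index of each row's positive: the counterpart view in the other half.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    logits = sim - sim.max(axis=1, keepdims=True)        # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()
```

A poisoning attack on this objective manipulates which inputs the loss pulls together: by injecting crafted pre-training inputs, the attacker influences the embeddings the encoder learns, which is what the abstract's bilevel formulation optimizes over.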