Transformers and masked language modeling are quickly being adopted and explored in computer vision as vision transformers and masked image modeling (MIM). In this work, we argue that image token masking differs from token masking in text, due to the amount and correlation of tokens in an image. In particular, to generate a challenging pretext task for MIM, we advocate a shift from random masking to informed masking. We develop and exhibit this idea in the context of distillation-based MIM, where a teacher transformer encoder generates an attention map, which we use to guide masking for the student. We thus introduce a novel masking strategy, called attention-guided masking (AttMask), and we demonstrate its effectiveness over random masking for dense distillation-based MIM as well as plain distillation-based self-supervised learning on classification tokens. We confirm that AttMask accelerates the learning process and improves the performance on a variety of downstream tasks. We provide the implementation code at https://github.com/gkakogeorgiou/attmask.
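To make the core idea concrete, below is a minimal PyTorch sketch of attention-guided masking: average the teacher's [CLS]-to-patch attention over heads, then hide the most-attended patches from the student. The function and parameter names (`attmask`, `cls_attn`, `mask_ratio`) are illustrative assumptions, not the repository's actual API.

```python
import torch

def attmask(cls_attn: torch.Tensor, mask_ratio: float = 0.3) -> torch.Tensor:
    """Sketch of attention-guided masking (assumed interface, not the repo API).

    cls_attn:   (B, H, N) attention weights from the [CLS] token to the
                N patch tokens, taken from the teacher's last attention layer.
    mask_ratio: fraction of patch tokens to mask.
    Returns a boolean mask of shape (B, N); True marks a masked patch.
    """
    attn = cls_attn.mean(dim=1)                  # average over heads -> (B, N)
    num_mask = int(mask_ratio * attn.shape[1])   # number of patches to hide
    idx = attn.argsort(dim=1, descending=True)   # most-attended patches first
    mask = torch.zeros_like(attn, dtype=torch.bool)
    mask.scatter_(1, idx[:, :num_mask], True)    # hide the top-attended patches
    return mask
```

Masking the most-attended patches (rather than random ones) removes the regions the teacher deems most discriminative, which is what makes the pretext task challenging for the student.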