Patch-level Representation Learning for Self-supervised Vision Transformers - Zhuanzhi (专知) Papers


June 16, 2022

Patch-level Representation Learning for Self-supervised Vision Transformers


Sukmin Yun, Hankook Lee, Jaehyung Kim, Jinwoo Shin

From arXiv. Accepted to CVPR 2022. Code is available at https://github.com/alinlab/SelfPatch

Recent self-supervised learning (SSL) methods have shown impressive results in learning visual representations from unlabeled images. This paper aims to improve their performance further by utilizing the architectural advantages of the underlying neural network, as the current state-of-the-art visual pretext tasks for SSL do not enjoy the benefit, i.e., they are architecture-agnostic. In particular, we focus on Vision Transformers (ViTs), which have gained much attention recently as a better architectural choice, often outperforming convolutional networks for various visual tasks. The unique characteristic of ViT is that it takes a sequence of disjoint patches from an image and processes patch-level representations internally. Inspired by this, we design a simple yet effective visual pretext task, coined SelfPatch, for learning better patch-level representations. To be specific, we enforce invariance against each patch and its neighbors, i.e., each patch treats similar neighboring patches as positive samples. Consequently, training ViTs with SelfPatch learns more semantically meaningful relations among patches (without using human-annotated labels), which can be beneficial, in particular, to downstream tasks of a dense prediction type. Despite its simplicity, we demonstrate that it can significantly improve the performance of existing SSL methods for various visual tasks, including object detection and semantic segmentation. Specifically, SelfPatch significantly improves the recent self-supervised ViT, DINO, by achieving +1.3 AP on COCO object detection, +1.2 AP on COCO instance segmentation, and +2.9 mIoU on ADE20K semantic segmentation.
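
The core idea stated in the abstract, that each patch treats similar neighboring patches as positive samples, can be illustrated with a minimal sketch. This is not the authors' implementation (see the SelfPatch repository for that). The 8-connected neighborhood, the cosine similarity measure, the parameter `k`, and the function names `cosine` and `select_positive_neighbors` are all assumptions made here for illustration, and the sketch covers only the selection of positives, not the training loss.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_positive_neighbors(grid, k=2):
    """For each patch in an H x W grid of embeddings, pick up to k adjacent patches
    with the highest cosine similarity.

    Assumption (hypothetical): "neighbors" means the 8-connected neighborhood
    and "similar" means high cosine similarity; the abstract fixes neither.
    """
    h, w = len(grid), len(grid[0])
    positives = {}
    for i in range(h):
        for j in range(w):
            scored = []
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    # Skip the patch itself and positions outside the grid.
                    if (di, dj) != (0, 0) and 0 <= ni < h and 0 <= nj < w:
                        scored.append((cosine(grid[i][j], grid[ni][nj]), (ni, nj)))
            # Most similar neighbors first; keep the top k as positives.
            scored.sort(key=lambda t: t[0], reverse=True)
            positives[(i, j)] = [pos for _, pos in scored[:k]]
    return positives


# Toy 2 x 3 grid of 2-dimensional patch embeddings (made-up values).
grid = [
    [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]],
    [[1.0, 0.1], [0.1, 0.9], [0.0, 1.0]],
]
positives = select_positive_neighbors(grid, k=1)
```

In a training setup, the positives selected for each patch would then feed an invariance objective between the patch and those neighbors. How that objective is defined is left to the paper and its released code.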

