Training Vision Transformers with Only 2040 Images - 专知 Papers

Vision · Transformer · ImageNet (dataset) · Inductive bias · Dataset

January 26, 2022

Training Vision Transformers with Only 2040 Images

Yun-Hao Cao, Hao Yu, Jianxin Wu

From arXiv, 11 pages

Vision Transformers (ViTs) are emerging as an alternative to convolutional neural networks (CNNs) for visual recognition. They achieve results competitive with CNNs, but the lack of the typical convolutional inductive bias makes them more data-hungry than common CNNs. They are often pretrained on JFT-300M or at least ImageNet, and few works study training ViTs with limited data. In this paper, we investigate how to train ViTs with limited data (e.g., 2040 images). We give theoretical analyses showing that our method (based on parametric instance discrimination) is superior to other methods in that it can capture both feature alignment and instance similarities. We achieve state-of-the-art results when training from scratch on 7 small datasets under various ViT backbones. We also investigate the transfer ability of small datasets and find that representations learned from small datasets can even improve large-scale ImageNet training.
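The abstract names parametric instance discrimination but does not spell out the objective. As a rough illustration only, the generic form of that idea treats every training image as its own class and trains a linear classifier with cross-entropy against the image index. The sketch below assumes this generic formulation; the function name `instance_discrimination_loss`, the toy feature dimension, and the zero-initialised classifier are hypothetical choices for the example and are not taken from the paper, whose full method may include further components.

```python
import math
import random


def instance_discrimination_loss(features, weights, instance_ids):
    """Mean cross-entropy where each feature's target class is its image index.

    features:     list of feature vectors (e.g. backbone outputs), one per sample
    weights:      classifier matrix with one row per training image
    instance_ids: index of the source image for each feature vector
    """
    total = 0.0
    for z, idx in zip(features, instance_ids):
        # One logit per training image: logits[j] = <weights[j], z>.
        logits = [sum(w_k * z_k for w_k, z_k in zip(w, z)) for w in weights]
        # Numerically stable log-sum-exp for the softmax normaliser.
        m = max(logits)
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of the correct instance class.
        total += log_norm - logits[idx]
    return total / len(features)


random.seed(0)
num_images, dim = 2040, 8  # 2040 images give 2040 instance classes (dim is a toy value)
weights = [[0.0] * dim for _ in range(num_images)]  # assumed zero initialisation
features = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(4)]
instance_ids = [0, 17, 512, 2039]

loss = instance_discrimination_loss(features, weights, instance_ids)
```

With a zero-initialised classifier every instance class is equally likely, so `loss` equals `math.log(2040)`, roughly 7.62. Training would lower it by pulling each image's feature toward its own classifier row.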
