
August 28, 2022

Conviformers: Convolutionally guided Vision Transformer


Mohit Vaishnav, Thomas Fel, Ivan Felipe Rodríguez, Thomas Serre

From arXiv; 12 pages, 4 figures, 8 tables

Vision transformers are nowadays the de-facto choice for image classification tasks. There are two broad categories of classification tasks, fine-grained and coarse-grained. In fine-grained classification, the necessity is to discover subtle differences due to the high level of similarity between sub-classes. Such distinctions are often lost as we downscale the image to save the memory and computational cost associated with vision transformers (ViT). In this work, we present an in-depth analysis and describe the critical components for developing a system for the fine-grained categorization of plants from herbarium sheets. Our extensive experimental analysis indicated the need for a better augmentation technique and the ability of modern-day neural networks to handle higher dimensional images. We also introduce a convolutional transformer architecture called Conviformer which, unlike the popular Vision Transformer (ConViT), can handle higher resolution images without exploding memory and computational cost. We also introduce a novel, improved pre-processing technique called PreSizer to resize images better while preserving their original aspect ratios, which proved essential for classifying natural plants. With our simple yet effective approach, we achieved SoTA on Herbarium 202x and iNaturalist 2019 dataset.
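The abstract makes two points that a little arithmetic can illustrate: downscaling is tempting because self-attention cost grows quickly with resolution, and resizing should keep the original aspect ratio. The sketch below is not the authors' PreSizer or Conviformer code. It assumes a plain ViT-style tokenizer with a hypothetical 16-pixel patch size and a generic "scale the longer side, then pad" resize; the function names `fit_preserving_aspect` and `attention_cost` are made up for illustration.

```python
def fit_preserving_aspect(width, height, target):
    """Scale (width, height) so the longer side equals `target`.

    Returns the resized (w, h) and the (pad_w, pad_h) still needed to fill a
    target x target canvas. How the padding is filled (constant, mirrored,
    etc.) is left open; the abstract does not specify it.
    """
    if width <= 0 or height <= 0 or target <= 0:
        raise ValueError("width, height and target must be positive")
    scale = target / max(width, height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    return (new_w, new_h), (target - new_w, target - new_h)


def attention_cost(side, patch=16):
    """Token count and pairwise attention entries for a square input.

    Assumes non-overlapping patch x patch tokens, as in a standard ViT.
    """
    tokens = (side // patch) ** 2
    return tokens, tokens * tokens


if __name__ == "__main__":
    # A portrait image, as herbarium sheets typically are (sizes are made up).
    print(fit_preserving_aspect(1000, 1500, 448))
    # Doubling the side length multiplies attention entries by 16.
    print(attention_cost(224), attention_cost(448))
```

Under these assumptions, going from 224 to 448 pixels per side quadruples the token count and multiplies the attention matrix size by 16, which is the kind of growth the abstract refers to as exploding memory and computational cost.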

