Vision Transformers (ViT) have made many breakthroughs in computer vision tasks. However, considerable redundancy arises in the spatial dimension of an input image, leading to massive computational costs. In this paper, we therefore propose a coarse-to-fine vision transformer (CF-ViT) to relieve the computational burden while retaining performance. Our proposed CF-ViT is motivated by two important observations in modern ViT models: (1) Coarse-grained patch splitting can locate the informative regions of an input image. (2) Most images can be well recognized by a ViT model with a short token sequence. Accordingly, our CF-ViT performs network inference in a two-stage manner. In the coarse inference stage, an input image is split into a short patch sequence for a computationally economical classification. If the image is not well recognized, its informative patches are identified and further re-split at a finer granularity. Extensive experiments demonstrate the efficacy of our CF-ViT. For example, without any compromise on performance, CF-ViT reduces the FLOPs of LV-ViT by 53% and achieves a 2.01x throughput improvement.
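The two-stage procedure can be illustrated with a minimal sketch. The code below is an assumption-laden illustration rather than the authors' implementation: `coarse_vit`, `fine_vit`, the confidence threshold, and the attention-based patch selection are hypothetical stand-ins for the coarse splitting, early exit, and fine-grained re-splitting described above.

```python
import torch

def cf_vit_inference(coarse_vit, fine_vit, image, threshold=0.7):
    """Sketch of two-stage coarse-to-fine inference (assumes batch size 1).

    `coarse_vit` and `fine_vit` are hypothetical ViT callables; the actual
    CF-ViT's patch re-splitting and feature reuse are simplified away here.
    """
    # Coarse stage: the image is split into a small number of large patches
    # and classified cheaply; `patch_attn` is a per-patch informativeness score.
    coarse_logits, patch_attn = coarse_vit(image)
    probs = torch.softmax(coarse_logits, dim=-1)
    confidence, prediction = probs.max(dim=-1)

    # Early exit: if the coarse prediction is already confident, stop here.
    if confidence.item() >= threshold:
        return prediction

    # Fine stage: keep only the most informative coarse patches and re-split
    # them at a finer granularity for a second, more expensive pass.
    informative = patch_attn.topk(k=patch_attn.numel() // 2).indices
    fine_logits = fine_vit(image, regions=informative)
    return torch.softmax(fine_logits, dim=-1).argmax(dim=-1)
```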