实现视觉学习理解:全面审查 (Transformers Meet Visual Learning Understanding: A Comprehensive Review) - 专知论文

会员服务 ·

0

变换 · 学成 · Performer · MoDELS · Vision ·

2022 年 3 月 24 日

Transformers Meet Visual Learning Understanding: A Comprehensive Review

翻译：实现视觉学习理解:全面审查

Yuting Yang,Licheng Jiao,Xu Liu,Fang Liu,Shuyuan Yang,Zhixi Feng,Xu Tang

from arxiv, arXiv admin note: text overlap with arXiv:2010.11929, arXiv:1706.03762 by other authors

Dynamic attention mechanism and global modeling ability make Transformer show strong feature learning ability. In recent years, Transformer has become comparable to CNNs methods in computer vision. This review mainly investigates the current research progress of Transformer in image and video applications, which makes a comprehensive overview of Transformer in visual learning understanding. First, the attention mechanism is reviewed, which plays an essential part in Transformer. And then, the visual Transformer model and the principle of each module are introduced. Thirdly, the existing Transformer-based models are investigated, and their performance is compared in visual learning understanding applications. Three image tasks and two video tasks of computer vision are investigated. The former mainly includes image classification, object detection, and image segmentation. The latter contains object tracking and video classification. It is significant for comparing different models' performance in various tasks on several public benchmark data sets. Finally, ten general problems are summarized, and the developing prospects of the visual Transformer are given in this review.

翻译：动态关注机制和全球建模能力使变异器显示出很强的学习能力。近年来,变异器在计算机视觉中已经与CNN方法相近。本审查主要调查变异器在图像和视频应用方面的当前研究进展,对变异器在视觉学习理解方面的全面概述。首先,对注意机制进行审查,在变异器中扮演重要角色。然后,引入了视觉变异器模型和每个模块的原则。第三,对现有变异器模型进行了调查,并在视觉学习应用中比较了它们的性能。调查了三种图像任务和计算机视觉的两种视频任务。前者主要包括图像分类、对象探测和图像分割。后者包含对象跟踪和图像分类。对于比较不同模型在若干公共基准数据集中的各项任务的业绩具有重要意义。最后,总结了10个一般性问题,并在本次审查中介绍了视觉变异器的发展前景。

28

相关内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

计算机科学课程与视频课件合集，Computer Science courses with video lectures

计算机科学课程与视频课件合集，Computer Science courses with video lectures

专知会员服务

37+阅读 · 2022年1月24日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

324+阅读 · 2020年11月26日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

【ICIG2021】Latest News & Announcements of the Industry Talk2

【ICIG2021】Latest News & Announcements of the Industry Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年7月29日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

副溶血弧菌VI型分泌系统的表型功能及基因调控研究

国家自然科学基金

1+阅读 · 2014年12月31日

精神分裂症易感因子ErbB4对篮状细胞和吊灯状细胞神经环路发育的调控和机制

国家自然科学基金

0+阅读 · 2014年12月31日

II-VI族半导体芯/壳纳米线异质结中的应变和掺杂

国家自然科学基金

0+阅读 · 2014年12月31日

跟踪器融合的视觉跟踪方法研究

国家自然科学基金

1+阅读 · 2013年12月31日

果胶微波降解过程中的分子链变化及其化学反应基础

国家自然科学基金

0+阅读 · 2013年12月31日

基于artifact的跨组织业务流非功能性需求变化分析与应对机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

Co基过渡金属合金团簇的结构和磁性理论研究

国家自然科学基金

0+阅读 · 2013年12月31日

稀疏张量学习理论

国家自然科学基金

1+阅读 · 2012年12月31日

随机延时神经网络的吸引子和分岔

国家自然科学基金

1+阅读 · 2012年12月31日

基于多尺度随机场模型的高分辨率遥感影像分割方法研究

国家自然科学基金

1+阅读 · 2011年12月31日

K-LITE: Learning Transferable Visual Models with External Knowledge

Arxiv

2+阅读 · 2022年4月20日

Visual Attention Methods in Deep Learning: An In-Depth Survey

Arxiv

44+阅读 · 2022年4月16日

Multi-Task Learning for Visual Scene Understanding

Arxiv

29+阅读 · 2022年3月28日

Transformers in Medical Image Analysis: A Review

Transformers in Medical Image Analysis: A Review

Arxiv

40+阅读 · 2022年2月24日

A Survey of Visual Transformers

Arxiv

39+阅读 · 2021年11月11日

Recent Advances of Continual Learning in Computer Vision: An Overview

Recent Advances of Continual Learning in Computer Vision: An Overview

Arxiv

22+阅读 · 2021年9月23日

Recent Advances and Trends in Multimodal Deep Learning: A Review

Arxiv

57+阅读 · 2021年5月24日

A Survey on Visual Transformer

Arxiv

19+阅读 · 2020年12月23日

A Comprehensive Survey on Transfer Learning

A Comprehensive Survey on Transfer Learning

Arxiv

121+阅读 · 2019年11月7日

Deep Reinforcement Learning: An Overview

Arxiv

15+阅读 · 2018年6月23日

VIP会员

文章信息

相关主题

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

计算机科学课程与视频课件合集，Computer Science courses with video lectures

计算机科学课程与视频课件合集，Computer Science courses with video lectures

专知会员服务

37+阅读 · 2022年1月24日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

324+阅读 · 2020年11月26日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

美国武装部队面临战车可维护性问题

《5G测试平台：探索5G在军事场景中的赋能平台》

海军无人系统：海上作战的演进而非革命

《未来无人海军系统：海上无人机效能增强与作战升级概览》2025最新93页

相关资讯

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

【ICIG2021】Latest News & Announcements of the Industry Talk2

【ICIG2021】Latest News & Announcements of the Industry Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年7月29日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

K-LITE: Learning Transferable Visual Models with External Knowledge

Arxiv

2+阅读 · 2022年4月20日

Visual Attention Methods in Deep Learning: An In-Depth Survey

Arxiv

44+阅读 · 2022年4月16日

Multi-Task Learning for Visual Scene Understanding

Arxiv

29+阅读 · 2022年3月28日

Transformers in Medical Image Analysis: A Review

Transformers in Medical Image Analysis: A Review

Arxiv

40+阅读 · 2022年2月24日

A Survey of Visual Transformers

Arxiv

39+阅读 · 2021年11月11日

Recent Advances of Continual Learning in Computer Vision: An Overview

Recent Advances of Continual Learning in Computer Vision: An Overview

Arxiv

22+阅读 · 2021年9月23日

Recent Advances and Trends in Multimodal Deep Learning: A Review

Arxiv

57+阅读 · 2021年5月24日

A Survey on Visual Transformer

Arxiv

19+阅读 · 2020年12月23日

A Comprehensive Survey on Transfer Learning

A Comprehensive Survey on Transfer Learning

Arxiv

121+阅读 · 2019年11月7日

Deep Reinforcement Learning: An Overview

Arxiv

15+阅读 · 2018年6月23日

相关基金

副溶血弧菌VI型分泌系统的表型功能及基因调控研究

国家自然科学基金

1+阅读 · 2014年12月31日

精神分裂症易感因子ErbB4对篮状细胞和吊灯状细胞神经环路发育的调控和机制

国家自然科学基金

0+阅读 · 2014年12月31日

II-VI族半导体芯/壳纳米线异质结中的应变和掺杂

国家自然科学基金

0+阅读 · 2014年12月31日

跟踪器融合的视觉跟踪方法研究

国家自然科学基金

1+阅读 · 2013年12月31日

果胶微波降解过程中的分子链变化及其化学反应基础

国家自然科学基金

0+阅读 · 2013年12月31日

基于artifact的跨组织业务流非功能性需求变化分析与应对机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

Co基过渡金属合金团簇的结构和磁性理论研究

国家自然科学基金

0+阅读 · 2013年12月31日

稀疏张量学习理论

国家自然科学基金

1+阅读 · 2012年12月31日

随机延时神经网络的吸引子和分岔

国家自然科学基金

1+阅读 · 2012年12月31日

基于多尺度随机场模型的高分辨率遥感影像分割方法研究

国家自然科学基金

1+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员