The transformer is a type of deep neural network based mainly on the self-attention mechanism and was originally applied in the field of natural language processing. Inspired by the strong representation ability of the transformer, researchers have proposed extending transformers to computer vision tasks. Transformer-based models achieve competitive and even better performance on various visual benchmarks compared with other network types such as convolutional networks and recurrent networks. Given their high performance and freedom from human-defined inductive biases, transformers are receiving more and more attention from the vision community. In this paper we provide a literature review of these visual transformer models, categorizing them by task and analyzing the advantages and disadvantages of each method. The main categories include basic image classification, high-level vision, low-level vision, and video processing. Self-attention in computer vision is also briefly revisited, since self-attention is the base component of the transformer. Efficient transformer methods are covered as well, as they push transformers toward real applications on resource-constrained devices. Finally, we discuss the challenges and further research directions for visual transformers.
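Since self-attention is the base component of the transformer discussed throughout this survey, a minimal sketch may help fix ideas. The following NumPy snippet implements single-head scaled dot-product self-attention; the projection matrices `Wq`, `Wk`, `Wv` and the dimensions are illustrative assumptions, not taken from any specific model in the survey.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a token sequence X of shape (n, d_model)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv          # project tokens to queries, keys, values
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)           # (n, n) pairwise attention logits
    weights = softmax(scores, axis=-1)        # each row is a distribution over tokens
    return weights @ V                        # weighted sum of value vectors

# Toy example: 4 tokens with model dimension 8
rng = np.random.default_rng(0)
n, d_model, d_k = 4, 8, 8
X = rng.standard_normal((n, d_model))
Wq, Wk, Wv = (rng.standard_normal((d_model, d_k)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8): one output vector per input token
```

In vision transformers, `X` would typically hold embeddings of image patches rather than word tokens, and multiple such heads are run in parallel and concatenated.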