While originally designed for natural language processing tasks, the self-attention mechanism has recently taken various computer vision areas by storm. However, the 2D nature of images brings three challenges for applying self-attention in computer vision. (1) Treating images as 1D sequences neglects their 2D structures. (2) The quadratic complexity is too expensive for high-resolution images. (3) Self-attention only captures spatial adaptability and ignores channel adaptability. In this paper, we propose a novel linear attention named large kernel attention (LKA) to enable self-adaptive and long-range correlations in self-attention while avoiding its shortcomings. Furthermore, we present a neural network based on LKA, namely Visual Attention Network (VAN). While extremely simple, VAN surpasses similar-size vision transformers (ViTs) and convolutional neural networks (CNNs) in various tasks, including image classification, object detection, semantic segmentation, panoptic segmentation, and pose estimation. For example, VAN-B6 achieves 87.8% accuracy on the ImageNet benchmark and sets new state-of-the-art performance (58.2 PQ) for panoptic segmentation. Besides, VAN-B2 surpasses Swin-T by 4.0% mIoU (50.1 vs. 46.1) for semantic segmentation on the ADE20K benchmark and by 2.6% AP (48.8 vs. 46.2) for object detection on the COCO dataset. It provides a novel method and a simple yet strong baseline for the community. Code is available at https://github.com/Visual-Attention-Network.
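To make the LKA idea concrete, the following is a minimal PyTorch-style sketch of an LKA-like module that decomposes a large-kernel convolution into a depth-wise convolution, a depth-wise dilated convolution, and a point-wise (1x1) convolution, and uses the result as an attention map multiplied element-wise with the input. The specific kernel sizes and dilation below are illustrative assumptions, not a definitive restatement of the released implementation; see the linked repository for the authors' code.

```python
import torch
import torch.nn as nn


class LKASketch(nn.Module):
    """Illustrative large-kernel-attention-style block.

    A large receptive field is approximated by stacking:
      1) a depth-wise conv (local structure),
      2) a depth-wise dilated conv (long-range context),
      3) a 1x1 conv (channel mixing, i.e. channel adaptability).
    The output is used as an attention map over the input.
    """

    def __init__(self, dim: int):
        super().__init__()
        # Depth-wise 5x5 conv: captures local spatial structure.
        self.dw_conv = nn.Conv2d(dim, dim, 5, padding=2, groups=dim)
        # Depth-wise 7x7 conv with dilation 3: enlarges the receptive field
        # cheaply (linear, not quadratic, in the number of pixels).
        self.dw_dilated = nn.Conv2d(dim, dim, 7, padding=9, groups=dim, dilation=3)
        # Point-wise 1x1 conv: mixes channels, providing channel adaptability.
        self.pw_conv = nn.Conv2d(dim, dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = self.dw_conv(x)
        attn = self.dw_dilated(attn)
        attn = self.pw_conv(attn)
        # Element-wise gating: the attention map rescales the input features.
        return x * attn


if __name__ == "__main__":
    # Quick shape check on a dummy feature map (batch 1, 64 channels, 32x32).
    x = torch.randn(1, 64, 32, 32)
    out = LKASketch(64)(x)
    print(out.shape)  # torch.Size([1, 64, 32, 32])
```

Because every operation is a convolution or an element-wise product, the cost grows linearly with the number of pixels, while the stacked depth-wise and dilated depth-wise convolutions emulate a much larger effective kernel, which is how long-range correlations are obtained without quadratic attention.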