共同比额表Conv-有意图像变异器 (Co-Scale Conv-Attentional Image Transformers) - 专知论文

会员服务 ·

0

变换 · 位置嵌入 · 图像分类器 · MoDELS · Integration ·

2021 年 8 月 26 日

Co-Scale Conv-Attentional Image Transformers

翻译：共同比额表Conv-有意图像变异器

Weijian Xu,Yifan Xu,Tyler Chang,Zhuowen Tu

from arxiv, Accepted to ICCV 2021 (Oral)

In this paper, we present Co-scale conv-attentional image Transformers (CoaT), a Transformer-based image classifier equipped with co-scale and conv-attentional mechanisms. First, the co-scale mechanism maintains the integrity of Transformers' encoder branches at individual scales, while allowing representations learned at different scales to effectively communicate with each other; we design a series of serial and parallel blocks to realize the co-scale mechanism. Second, we devise a conv-attentional mechanism by realizing a relative position embedding formulation in the factorized attention module with an efficient convolution-like implementation. CoaT empowers image Transformers with enriched multi-scale and contextual modeling capabilities. On ImageNet, relatively small CoaT models attain superior classification results compared with similar-sized convolutional neural networks and image/vision Transformers. The effectiveness of CoaT's backbone is also illustrated on object detection and instance segmentation, demonstrating its applicability to downstream computer vision tasks.

翻译：在本文中,我们介绍一个基于变压器的图像分类器(CoaT),这是一个基于变压器的图像分类器,配有共标和共控机制。首先,共同标的机制保持每个规模的变压器编码器分支的完整性,同时允许在不同规模上所学的表达方式有效地相互交流;我们设计了一系列序列块和平行块以实现共同标尺机制。第二,我们设计了一个共标机制,将相对位置的配方嵌入集成成的模块中,并有一个高效的相控式类似的实施。CoaT赋予图像变压器以富集多尺度和上下文模型的能力。在图像网络上,相对较小的 CoaT模型与类似规模的变压器神经网络和图像/视觉变压器相比,获得较高的分类结果。Coat的骨架在物体探测和实例分割上也展示了效果,表明其对下游计算机视觉任务的适用性。

0

相关内容

【斯坦福大学课程】2021年深度多任务学习与元学习，CS 330: Deep Multi-Task and Meta Learning

【斯坦福大学课程】2021年深度多任务学习与元学习，CS 330: Deep Multi-Task and Meta Learning

专知会员服务

110+阅读 · 2022年3月2日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

321+阅读 · 2020年11月26日

【ICLR 2019】双曲注意力网络，Hyperbolic Attention Network

【ICLR 2019】双曲注意力网络，Hyperbolic Attention Network

专知会员服务

84+阅读 · 2020年6月21日

自然语言处理中的注意力机制，Attention in Natural Language Processing

自然语言处理中的注意力机制，Attention in Natural Language Processing

专知会员服务

136+阅读 · 2020年5月30日

【CMU】图卷积神经网络中的池化综述，Pooling in Graph Convolutional Neural Network

【CMU】图卷积神经网络中的池化综述，Pooling in Graph Convolutional Neural Network

专知会员服务

46+阅读 · 2020年4月8日

【Google论文】ALBERT:自我监督学习语言表达的精简BERT

【Google论文】ALBERT:自我监督学习语言表达的精简BERT

专知会员服务

24+阅读 · 2019年11月4日

注意力机制模型最新综述

注意力机制模型最新综述

专知会员服务

270+阅读 · 2019年10月20日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

注意力机制介绍，Attention Mechanism

注意力机制介绍，Attention Mechanism

专知会员服务

171+阅读 · 2019年10月13日

Multi-Task Learning的几篇综述文章

Multi-Task Learning的几篇综述文章

深度学习自然语言处理

15+阅读 · 2020年6月15日

一文读懂Attention机制

一文读懂Attention机制

机器学习与推荐算法

63+阅读 · 2020年6月9日

Attention最新进展

Attention最新进展

极市平台

5+阅读 · 2020年5月30日

综述：DenseNet—Dense卷积网络（图像分类）

综述：DenseNet—Dense卷积网络（图像分类）

专知

86+阅读 · 2018年11月26日

【论文推荐】最新七篇图像分割相关论文—Attention U-Net、对抗结构匹配损失、卷积CRFs、对抗样本、弱监督分割

【论文推荐】最新七篇图像分割相关论文—Attention U-Net、对抗结构匹配损失、卷积CRFs、对抗样本、弱监督分割

专知

19+阅读 · 2018年5月31日

计算机视觉领域顶会CVPR 2018 接受论文列表

计算机视觉领域顶会CVPR 2018 接受论文列表

专知

7+阅读 · 2018年5月26日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

【推荐】深度学习目标检测全面综述

【推荐】深度学习目标检测全面综述

机器学习研究会

21+阅读 · 2017年9月13日

论文共读 | Attention is All You Need

论文共读 | Attention is All You Need

黑龙江大学自然语言处理实验室

14+阅读 · 2017年9月7日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

HRFormer: High-Resolution Transformer for Dense Prediction

Arxiv

0+阅读 · 2021年10月18日

Tensor-to-Image: Image-to-Image Translation with Vision Transformers

Arxiv

1+阅读 · 2021年10月6日

Learning Hard Retrieval Decoder Attention for Transformers

Arxiv

0+阅读 · 2021年9月10日

Multi-Scale Self-Attention for Text Classification

Arxiv

4+阅读 · 2019年12月2日

Scene-based Factored Attention for Image Captioning

Arxiv

4+阅读 · 2019年8月7日

Cross-Modal Self-Attention Network for Referring Image Segmentation

Cross-Modal Self-Attention Network for Referring Image Segmentation

Arxiv

18+阅读 · 2019年4月9日

Star-Transformer

Star-Transformer

Arxiv

5+阅读 · 2019年2月28日

FocusNet: An attention-based Fully Convolutional Network for Medical Image Segmentation

FocusNet: An attention-based Fully Convolutional Network for Medical Image Segmentation

Arxiv

8+阅读 · 2019年2月8日

MDU-Net: Multi-scale Densely Connected U-Net for biomedical image segmentation

MDU-Net: Multi-scale Densely Connected U-Net for biomedical image segmentation

Arxiv

10+阅读 · 2018年12月4日

Paying More Attention to Saliency: Image Captioning with Saliency and Context Attention

Arxiv

7+阅读 · 2018年5月21日

VIP会员

文章信息

相关主题

图像分类器

相关VIP内容

【斯坦福大学课程】2021年深度多任务学习与元学习，CS 330: Deep Multi-Task and Meta Learning

【斯坦福大学课程】2021年深度多任务学习与元学习，CS 330: Deep Multi-Task and Meta Learning

专知会员服务

110+阅读 · 2022年3月2日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

321+阅读 · 2020年11月26日

【ICLR 2019】双曲注意力网络，Hyperbolic Attention Network

【ICLR 2019】双曲注意力网络，Hyperbolic Attention Network

专知会员服务

84+阅读 · 2020年6月21日

自然语言处理中的注意力机制，Attention in Natural Language Processing

自然语言处理中的注意力机制，Attention in Natural Language Processing

专知会员服务

136+阅读 · 2020年5月30日

【CMU】图卷积神经网络中的池化综述，Pooling in Graph Convolutional Neural Network

【CMU】图卷积神经网络中的池化综述，Pooling in Graph Convolutional Neural Network

专知会员服务

46+阅读 · 2020年4月8日

【Google论文】ALBERT:自我监督学习语言表达的精简BERT

【Google论文】ALBERT:自我监督学习语言表达的精简BERT

专知会员服务

24+阅读 · 2019年11月4日

注意力机制模型最新综述

注意力机制模型最新综述

专知会员服务

270+阅读 · 2019年10月20日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

注意力机制介绍，Attention Mechanism

注意力机制介绍，Attention Mechanism

专知会员服务

171+阅读 · 2019年10月13日

热门VIP内容

开通专知VIP会员享更多权益服务

面向具身智能的多模态数据存储与检索：综述

《算法战争研究计划全景评估》35页

【CMU博士论文】水下三维视觉感知与生成

智能体战争：自主人工智能军备竞赛全景透视

相关资讯

Multi-Task Learning的几篇综述文章

Multi-Task Learning的几篇综述文章

深度学习自然语言处理

15+阅读 · 2020年6月15日

一文读懂Attention机制

一文读懂Attention机制

机器学习与推荐算法

63+阅读 · 2020年6月9日

Attention最新进展

Attention最新进展

极市平台

5+阅读 · 2020年5月30日

综述：DenseNet—Dense卷积网络（图像分类）

综述：DenseNet—Dense卷积网络（图像分类）

专知

86+阅读 · 2018年11月26日

【论文推荐】最新七篇图像分割相关论文—Attention U-Net、对抗结构匹配损失、卷积CRFs、对抗样本、弱监督分割

【论文推荐】最新七篇图像分割相关论文—Attention U-Net、对抗结构匹配损失、卷积CRFs、对抗样本、弱监督分割

专知

19+阅读 · 2018年5月31日

计算机视觉领域顶会CVPR 2018 接受论文列表

计算机视觉领域顶会CVPR 2018 接受论文列表

专知

7+阅读 · 2018年5月26日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

【推荐】深度学习目标检测全面综述

【推荐】深度学习目标检测全面综述

机器学习研究会

21+阅读 · 2017年9月13日

论文共读 | Attention is All You Need

论文共读 | Attention is All You Need

黑龙江大学自然语言处理实验室

14+阅读 · 2017年9月7日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

相关论文

HRFormer: High-Resolution Transformer for Dense Prediction

Arxiv

0+阅读 · 2021年10月18日

Tensor-to-Image: Image-to-Image Translation with Vision Transformers

Arxiv

1+阅读 · 2021年10月6日

Learning Hard Retrieval Decoder Attention for Transformers

Arxiv

0+阅读 · 2021年9月10日

Multi-Scale Self-Attention for Text Classification

Arxiv

4+阅读 · 2019年12月2日

Scene-based Factored Attention for Image Captioning

Arxiv

4+阅读 · 2019年8月7日

Cross-Modal Self-Attention Network for Referring Image Segmentation

Cross-Modal Self-Attention Network for Referring Image Segmentation

Arxiv

18+阅读 · 2019年4月9日

Star-Transformer

Star-Transformer

Arxiv

5+阅读 · 2019年2月28日

FocusNet: An attention-based Fully Convolutional Network for Medical Image Segmentation

FocusNet: An attention-based Fully Convolutional Network for Medical Image Segmentation

Arxiv

8+阅读 · 2019年2月8日

MDU-Net: Multi-scale Densely Connected U-Net for biomedical image segmentation

MDU-Net: Multi-scale Densely Connected U-Net for biomedical image segmentation

Arxiv

10+阅读 · 2018年12月4日

Paying More Attention to Saliency: Image Captioning with Saliency and Context Attention

Arxiv

7+阅读 · 2018年5月21日

微信扫码咨询专知VIP会员