视觉识别双流网络 (Dual-stream Network for Visual Recognition) - 专知论文

会员服务 ·

0

表示容量 · INFORMS · Performer · state-of-the-art · Networking ·

2021 年 10 月 27 日

Dual-stream Network for Visual Recognition

翻译：视觉识别双流网络

Mingyuan Mao,Renrui Zhang,Honghui Zheng,Peng Gao,Teli Ma,Yan Peng,Errui Ding,Baochang Zhang,Shumin Han

from arxiv, Accepted by NeurIPS 2021

Transformers with remarkable global representation capacities achieve competitive results for visual tasks, but fail to consider high-level local pattern information in input images. In this paper, we present a generic Dual-stream Network (DS-Net) to fully explore the representation capacity of local and global pattern features for image classification. Our DS-Net can simultaneously calculate fine-grained and integrated features and efficiently fuse them. Specifically, we propose an Intra-scale Propagation module to process two different resolutions in each block and an Inter-Scale Alignment module to perform information interaction across features at dual scales. Besides, we also design a Dual-stream FPN (DS-FPN) to further enhance contextual information for downstream dense predictions. Without bells and whistles, the proposed DS-Net outperforms DeiT-Small by 2.4% in terms of top-1 accuracy on ImageNet-1k and achieves state-of-the-art performance over other Vision Transformers and ResNets. For object detection and instance segmentation, DS-Net-Small respectively outperforms ResNet-50 by 6.4% and 5.5% in terms of mAP on MSCOCO 2017, and surpasses the previous state-of-the-art scheme, which significantly demonstrates its potential to be a general backbone in vision tasks. The code will be released soon.

翻译：具有显著全球代表性能力的变异器在视觉任务方面实现竞争性结果,但未能考虑投入图像中的高水平本地模式信息。在本文中,我们提出了一个通用的双流网络(DS-Net),以充分探索图像分类的本地和全球模式特征的代表性能力。我们的DS-Net可以同时计算细微和集成的特性,并有效地结合这些特性。具体地说,我们提议了一个内部推进模块,以处理每个区块的2个不同分辨率,以及一个跨系统协调模块,以便在两个尺度上进行信息互动。此外,我们还设计了一个双流 FPN(DS-FPN)双流网络,以进一步加强下游密集预测的背景资料。如果没有钟声和哨音,拟议的DS-Net将DeiT-Small比DiT-Small高出2.4%,在图像Net-1k的顶级和集级精度上,实现与其他愿景变异的状态表现。关于目标探测和实例分解,DS-Net-Smarall分别将Res-50比ResNet(DS-FSP-50)比对下层预测环境预测,将很快显示MASAP总蓝图的版本。

0

相关内容

表示容量

最新《深度学习视频超分》综述论文，30页pdf，Video Super Resolution Based on Deep Learning: A comprehensive survey

最新《深度学习视频超分》综述论文，30页pdf，Video Super Resolution Based on Deep Learning: A comprehensive survey

专知会员服务

25+阅读 · 2020年7月28日

【文献综述】Text Detection and Recognition in the Wild: A Review 自然文本检测与识别

【文献综述】Text Detection and Recognition in the Wild: A Review 自然文本检测与识别

专知会员服务

46+阅读 · 2020年6月11日

【视频目标检测与跟踪：综述论文】Video Object Segmentation and Tracking: A Survey

专知会员服务

66+阅读 · 2020年6月4日

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

【干货】深度学习视觉跟踪:论文最新综述，23页pdf，Deep Learning for Visual Tracking: A Comprehensive Survey

【干货】深度学习视觉跟踪:论文最新综述，23页pdf，Deep Learning for Visual Tracking: A Comprehensive Survey

专知会员服务

58+阅读 · 2019年12月2日

【CVPR 2019 | tutorial】野外家庭的视觉识别： Visual Recognition of Families In the Wild

【CVPR 2019 | tutorial】野外家庭的视觉识别： Visual Recognition of Families In the Wild

专知会员服务

10+阅读 · 2019年11月28日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知

16+阅读 · 2020年5月31日

CVPR2019年热门论文及开源代码分享

CVPR2019年热门论文及开源代码分享

深度学习与NLP

7+阅读 · 2019年6月3日

简评 | Video Action Recognition 的近期进展

简评 | Video Action Recognition 的近期进展

极市平台

20+阅读 · 2019年4月21日

CVPR2019 | 全景分割：Attention-guided Unified Network

CVPR2019 | 全景分割：Attention-guided Unified Network

极市平台

9+阅读 · 2019年3月3日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

哇~这么Deep且又轻量的Network，实时目标检测

哇~这么Deep且又轻量的Network，实时目标检测

计算机视觉战队

7+阅读 · 2018年8月15日

已删除

将门创投

5+阅读 · 2017年11月20日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【推荐】深度学习目标检测全面综述

【推荐】深度学习目标检测全面综述

机器学习研究会

21+阅读 · 2017年9月13日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

Multi-Branch with Attention Network for Hand-Based Person Recognition

Arxiv

0+阅读 · 2021年12月26日

ResT: An Efficient Transformer for Visual Recognition

Arxiv

3+阅读 · 2021年10月14日

Involution: Inverting the Inherence of Convolution for Visual Recognition

Arxiv

6+阅读 · 2021年3月10日

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

Arxiv

13+阅读 · 2021年1月5日

Gated Channel Transformation for Visual Recognition

Arxiv

4+阅读 · 2020年3月27日

Scene Text Detection and Recognition: The Deep Learning Era

Scene Text Detection and Recognition: The Deep Learning Era

Arxiv

27+阅读 · 2019年9月5日

SlowFast Networks for Video Recognition

SlowFast Networks for Video Recognition

Arxiv

19+阅读 · 2018年12月10日

A Fully Convolutional Two-Stream Fusion Network for Interactive Image Segmentation

Arxiv

3+阅读 · 2018年10月2日

ECO: Efficient Convolutional Network for Online Video Understanding

Arxiv

5+阅读 · 2018年5月7日

Reconstruction Network for Video Captioning

Arxiv

5+阅读 · 2018年3月30日

VIP会员

文章信息

相关主题

state-of-the-art

相关VIP内容

最新《深度学习视频超分》综述论文，30页pdf，Video Super Resolution Based on Deep Learning: A comprehensive survey

最新《深度学习视频超分》综述论文，30页pdf，Video Super Resolution Based on Deep Learning: A comprehensive survey

专知会员服务

25+阅读 · 2020年7月28日

【文献综述】Text Detection and Recognition in the Wild: A Review 自然文本检测与识别

【文献综述】Text Detection and Recognition in the Wild: A Review 自然文本检测与识别

专知会员服务

46+阅读 · 2020年6月11日

【视频目标检测与跟踪：综述论文】Video Object Segmentation and Tracking: A Survey

专知会员服务

66+阅读 · 2020年6月4日

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知会员服务

97+阅读 · 2020年5月31日

【干货】深度学习视觉跟踪:论文最新综述，23页pdf，Deep Learning for Visual Tracking: A Comprehensive Survey

【干货】深度学习视觉跟踪:论文最新综述，23页pdf，Deep Learning for Visual Tracking: A Comprehensive Survey

专知会员服务

58+阅读 · 2019年12月2日

【CVPR 2019 | tutorial】野外家庭的视觉识别： Visual Recognition of Families In the Wild

【CVPR 2019 | tutorial】野外家庭的视觉识别： Visual Recognition of Families In the Wild

专知会员服务

10+阅读 · 2019年11月28日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

《陆军战斗操练中的关键事件诊断》

《自适应训练辅助概念及其在空战管理员加速训练中的应用导论》最新126页

军事通信市场七大趋势概述

《抗干扰无人机蜂群行为的遗传算法方法》

相关资讯

零样本文本分类，Zero-Shot Learning for Text Classification

零样本文本分类，Zero-Shot Learning for Text Classification

专知

16+阅读 · 2020年5月31日

CVPR2019年热门论文及开源代码分享

CVPR2019年热门论文及开源代码分享

深度学习与NLP

7+阅读 · 2019年6月3日

简评 | Video Action Recognition 的近期进展

简评 | Video Action Recognition 的近期进展

极市平台

20+阅读 · 2019年4月21日

CVPR2019 | 全景分割：Attention-guided Unified Network

CVPR2019 | 全景分割：Attention-guided Unified Network

极市平台

9+阅读 · 2019年3月3日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

哇~这么Deep且又轻量的Network，实时目标检测

哇~这么Deep且又轻量的Network，实时目标检测

计算机视觉战队

7+阅读 · 2018年8月15日

已删除

将门创投

5+阅读 · 2017年11月20日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【推荐】深度学习目标检测全面综述

【推荐】深度学习目标检测全面综述

机器学习研究会

21+阅读 · 2017年9月13日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

相关论文

Multi-Branch with Attention Network for Hand-Based Person Recognition

Arxiv

0+阅读 · 2021年12月26日

ResT: An Efficient Transformer for Visual Recognition

Arxiv

3+阅读 · 2021年10月14日

Involution: Inverting the Inherence of Convolution for Visual Recognition

Arxiv

6+阅读 · 2021年3月10日

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

Arxiv

13+阅读 · 2021年1月5日

Gated Channel Transformation for Visual Recognition

Arxiv

4+阅读 · 2020年3月27日

Scene Text Detection and Recognition: The Deep Learning Era

Scene Text Detection and Recognition: The Deep Learning Era

Arxiv

27+阅读 · 2019年9月5日

SlowFast Networks for Video Recognition

SlowFast Networks for Video Recognition

Arxiv

19+阅读 · 2018年12月10日

A Fully Convolutional Two-Stream Fusion Network for Interactive Image Segmentation

Arxiv

3+阅读 · 2018年10月2日

ECO: Efficient Convolutional Network for Online Video Understanding

Arxiv

5+阅读 · 2018年5月7日

Reconstruction Network for Video Captioning

Arxiv

5+阅读 · 2018年3月30日

微信扫码咨询专知VIP会员