Transformers are transforming the landscape of computer vision, especially for recognition tasks. Detection transformers are the first fully end-to-end learning systems for object detection, while vision transformers are the first fully transformer-based architectures for image classification. In this paper, we integrate Vision and Detection Transformers (ViDT) to build an effective and efficient object detector. ViDT introduces a reconfigured attention module to extend the recent Swin Transformer to be a standalone object detector, followed by a computationally efficient transformer decoder that exploits multi-scale features and auxiliary techniques essential to boost the detection performance without much increase in computational load. Extensive evaluation results on the Microsoft COCO benchmark dataset demonstrate that ViDT obtains the best AP and latency trade-off among existing fully transformer-based object detectors, and achieves 49.2 AP owing to its high scalability for large models. We will release the code and trained models at https://github.com/naver-ai/vidt