VIDT: 高效和有效的完全变换基于物体的物体探测器 (ViDT: An Efficient and Effective Fully Transformer-based Object Detector) - 专知论文

会员服务 ·

0

变换 · Vision · Extensibility · Performer · Swin Transformer ·

2021 年 11 月 29 日

ViDT: An Efficient and Effective Fully Transformer-based Object Detector

翻译：VIDT: 高效和有效的完全变换基于物体的物体探测器

Hwanjun Song,Deqing Sun,Sanghyuk Chun,Varun Jampani,Dongyoon Han,Byeongho Heo,Wonjae Kim,Ming-Hsuan Yang

from arxiv, This is a revised version on Nov. 29

Transformers are transforming the landscape of computer vision, especially for recognition tasks. Detection transformers are the first fully end-to-end learning systems for object detection, while vision transformers are the first fully transformer-based architecture for image classification. In this paper, we integrate Vision and Detection Transformers (ViDT) to build an effective and efficient object detector. ViDT introduces a reconfigured attention module to extend the recent Swin Transformer to be a standalone object detector, followed by a computationally efficient transformer decoder that exploits multi-scale features and auxiliary techniques essential to boost the detection performance without much increase in computational load. Extensive evaluation results on the Microsoft COCO benchmark dataset demonstrate that ViDT obtains the best AP and latency trade-off among existing fully transformer-based object detectors, and achieves 49.2AP owing to its high scalability for large models. We will release the code and trained models at https://github.com/naver-ai/vidt

翻译：检测变压器是第一个完全端到端的物体探测学习系统,而视觉变压器则是第一个完全变压器的图像分类结构。在本文中,我们整合了视觉和探测变压器(VIDT),以建立一个有效和高效的物体探测器。 VIDT引入了重新配置的注意模块,将最近的Swin变压器扩大为独立的物体探测器,随后是计算效率高的变压器解码器,利用多种规模的变压器和辅助技术来提高探测性能,而不会大大增加计算负荷。微软COCO基准数据集的广泛评价结果表明,VIDT在现有的完全变压器物体探测器中获得了最佳的AP和耐用量交换,并实现了49.2AP,因为它对大型模型具有高度的可变性能。我们将在 https://github.com/naver-ai/vidt发布代码和经过培训的模型。我们将在 https://github. com/naver-i/vidt。

0

相关内容

【AAAI2022】锚点DETR：基于transformer检测器的查询设计

【AAAI2022】锚点DETR：基于transformer检测器的查询设计

专知会员服务

13+阅读 · 2021年12月31日

【CVPR2021】用Transformers无监督预训练进行目标检测

【CVPR2021】用Transformers无监督预训练进行目标检测

专知会员服务

58+阅读 · 2021年3月3日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

324+阅读 · 2020年11月26日

【ECCV2020】EfficientFCN：语义分割中的整体引导解码器

【ECCV2020】EfficientFCN：语义分割中的整体引导解码器

专知会员服务

18+阅读 · 2020年8月23日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

命名实体识别新SOTA：改进Transformer模型

命名实体识别新SOTA：改进Transformer模型

AI科技评论

17+阅读 · 2019年11月26日

已删除

将门创投

4+阅读 · 2019年4月1日

Github项目推荐 | awesome-bert：BERT相关资源大列表

Github项目推荐 | awesome-bert：BERT相关资源大列表

AI研习社

27+阅读 · 2019年2月26日

tensorflow Object Detection API使用预训练模型mask r-cnn实现对象检测

tensorflow Object Detection API使用预训练模型mask r-cnn实现对象检测

极市平台

12+阅读 · 2018年8月24日

语音顶级会议Interspeech2018接受论文列表！

语音顶级会议Interspeech2018接受论文列表！

专知

6+阅读 · 2018年6月10日

【论文推荐】最新九篇目标检测相关论文—常识性知识转移、尺度不敏感、多尺度位置感知、渐进式域适应、时间感知特征图、人机合作

【论文推荐】最新九篇目标检测相关论文—常识性知识转移、尺度不敏感、多尺度位置感知、渐进式域适应、时间感知特征图、人机合作

专知

17+阅读 · 2018年4月11日

语义分割+视频分割开源代码集合

语义分割+视频分割开源代码集合

极市平台

35+阅读 · 2018年3月5日

Mask R-CNN 源代码终上线，Facebook 开源目标检测平台—Detectron

Mask R-CNN 源代码终上线，Facebook 开源目标检测平台—Detectron

AI100

7+阅读 · 2018年1月24日

【推荐】深度学习目标检测全面综述

【推荐】深度学习目标检测全面综述

机器学习研究会

21+阅读 · 2017年9月13日

DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR

Arxiv

0+阅读 · 2022年1月28日

Anchor DETR: Query Design for Transformer-Based Detector

Arxiv

5+阅读 · 2021年9月15日

End-to-End Object Detection with Fully Convolutional Network

Arxiv

9+阅读 · 2021年3月26日

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

Arxiv

19+阅读 · 2020年11月18日

Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector

Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector

Arxiv

17+阅读 · 2020年3月31日

CornerNet-Lite: Efficient Keypoint Based Object Detection

CornerNet-Lite: Efficient Keypoint Based Object Detection

Arxiv

3+阅读 · 2019年4月18日

Reverse Attention for Salient Object Detection

Arxiv

11+阅读 · 2019年4月15日

FoveaBox: Beyond Anchor-based Object Detector

Arxiv

5+阅读 · 2019年4月8日

Video Object Detection with an Aligned Spatial-Temporal Memory

Video Object Detection with an Aligned Spatial-Temporal Memory

Arxiv

4+阅读 · 2018年7月27日

Receptive Field Block Net for Accurate and Fast Object Detection

Receptive Field Block Net for Accurate and Fast Object Detection

Arxiv

3+阅读 · 2018年7月26日

VIP会员

文章信息

相关主题

Swin Transformer

相关VIP内容

【AAAI2022】锚点DETR：基于transformer检测器的查询设计

【AAAI2022】锚点DETR：基于transformer检测器的查询设计

专知会员服务

13+阅读 · 2021年12月31日

【CVPR2021】用Transformers无监督预训练进行目标检测

【CVPR2021】用Transformers无监督预训练进行目标检测

专知会员服务

58+阅读 · 2021年3月3日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

324+阅读 · 2020年11月26日

【ECCV2020】EfficientFCN：语义分割中的整体引导解码器

【ECCV2020】EfficientFCN：语义分割中的整体引导解码器

专知会员服务

18+阅读 · 2020年8月23日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

人工智能赋能自主武器与人类控制第三部分：人类控制与系统操作员 | 35页

人工智能赋能自主武器与人类控制第一部分：人类控制与机器学习的设计和开发 | 46页

军事指挥控制系统：2025年5种用途

人工智能赋能自主武器与人类控制第二部分：人类控制与军事指挥官 | 38页

相关资讯

命名实体识别新SOTA：改进Transformer模型

命名实体识别新SOTA：改进Transformer模型

AI科技评论

17+阅读 · 2019年11月26日

已删除

将门创投

4+阅读 · 2019年4月1日

Github项目推荐 | awesome-bert：BERT相关资源大列表

Github项目推荐 | awesome-bert：BERT相关资源大列表

AI研习社

27+阅读 · 2019年2月26日

tensorflow Object Detection API使用预训练模型mask r-cnn实现对象检测

tensorflow Object Detection API使用预训练模型mask r-cnn实现对象检测

极市平台

12+阅读 · 2018年8月24日

语音顶级会议Interspeech2018接受论文列表！

语音顶级会议Interspeech2018接受论文列表！

专知

6+阅读 · 2018年6月10日

【论文推荐】最新九篇目标检测相关论文—常识性知识转移、尺度不敏感、多尺度位置感知、渐进式域适应、时间感知特征图、人机合作

【论文推荐】最新九篇目标检测相关论文—常识性知识转移、尺度不敏感、多尺度位置感知、渐进式域适应、时间感知特征图、人机合作

专知

17+阅读 · 2018年4月11日

语义分割+视频分割开源代码集合

语义分割+视频分割开源代码集合

极市平台

35+阅读 · 2018年3月5日

Mask R-CNN 源代码终上线，Facebook 开源目标检测平台—Detectron

Mask R-CNN 源代码终上线，Facebook 开源目标检测平台—Detectron

AI100

7+阅读 · 2018年1月24日

【推荐】深度学习目标检测全面综述

【推荐】深度学习目标检测全面综述

机器学习研究会

21+阅读 · 2017年9月13日

相关论文

DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR

Arxiv

0+阅读 · 2022年1月28日

Anchor DETR: Query Design for Transformer-Based Detector

Arxiv

5+阅读 · 2021年9月15日

End-to-End Object Detection with Fully Convolutional Network

Arxiv

9+阅读 · 2021年3月26日

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

Arxiv

19+阅读 · 2020年11月18日

Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector

Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector

Arxiv

17+阅读 · 2020年3月31日

CornerNet-Lite: Efficient Keypoint Based Object Detection

CornerNet-Lite: Efficient Keypoint Based Object Detection

Arxiv

3+阅读 · 2019年4月18日

Reverse Attention for Salient Object Detection

Arxiv

11+阅读 · 2019年4月15日

FoveaBox: Beyond Anchor-based Object Detector

Arxiv

5+阅读 · 2019年4月8日

Video Object Detection with an Aligned Spatial-Temporal Memory

Video Object Detection with an Aligned Spatial-Temporal Memory

Arxiv

4+阅读 · 2018年7月27日

Receptive Field Block Net for Accurate and Fast Object Detection

Receptive Field Block Net for Accurate and Fast Object Detection

Arxiv

3+阅读 · 2018年7月26日

微信扫码咨询专知VIP会员