In this paper, we present a novel query formulation using dynamic anchor boxes for DETR (DEtection TRansformer) and offer a deeper understanding of the role of queries in DETR. This new formulation directly uses box coordinates as queries in Transformer decoders and dynamically updates them layer by layer. Using box coordinates not only helps leverage explicit positional priors to improve the query-to-feature similarity and eliminate the slow training convergence issue in DETR, but also allows us to modulate the positional attention map using the box width and height information. Such a design makes it clear that queries in DETR can be implemented as performing soft ROI pooling layer by layer in a cascade manner. As a result, it achieves the best performance on the MS-COCO benchmark among DETR-like detection models under the same setting, e.g., AP 45.7\% with a ResNet50-DC5 backbone trained for 50 epochs. We also conducted extensive experiments to confirm our analysis and verify the effectiveness of our method. Code is available at \url{https://github.com/SlongLiu/DAB-DETR}.
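The core idea — using 4D anchor box coordinates directly as decoder queries — can be sketched in a few lines. The following is a minimal, illustrative sketch, not the authors' implementation: the function names (`sine_embed`, `anchor_box_query`) and parameter choices are our own assumptions, and only the DETR-style sinusoidal encoding of coordinates is taken from the paper's description.

```python
import math

def sine_embed(coord, dim=128, temperature=10000.0):
    """DETR-style sinusoidal embedding of one normalized coordinate.

    Illustrative sketch only: names and defaults are assumptions,
    not the authors' code.
    """
    emb = []
    for i in range(dim // 2):
        freq = temperature ** (2 * i / dim)
        emb.append(math.sin(2 * math.pi * coord / freq))
        emb.append(math.cos(2 * math.pi * coord / freq))
    return emb

def anchor_box_query(box, dim=128):
    """Build a positional query directly from a 4D anchor (x, y, w, h),
    each normalized to [0, 1].

    Concatenating the four coordinate embeddings yields a 4*dim query
    carrying an explicit positional prior; each decoder layer can then
    refine the anchor by predicting a box offset (the layer-by-layer
    dynamic update described in the abstract).
    """
    x, y, w, h = box
    return (sine_embed(x, dim) + sine_embed(y, dim)
            + sine_embed(w, dim) + sine_embed(h, dim))

# Conceptually, the w/h entries let the model rescale the x- and y-parts
# of the positional attention map, so wide boxes attend over wider
# x-ranges -- the "soft ROI pooling" interpretation of queries.
q = anchor_box_query((0.5, 0.5, 0.2, 0.3))
print(len(q))  # 512 = 4 coordinates * 128 dims each
```

In the full model these embeddings would be projected by an MLP and added into decoder cross-attention; the sketch above only shows how a box becomes a query vector.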