Transformer-based stereo-aware 3D object detection from binocular images - 专知论文

会员服务 ·

0

INFORMS · 查准率/准确率 · 3D · 变换 · 目标检测 ·

2023 年 4 月 24 日

Transformer-based stereo-aware 3D object detection from binocular images

翻译：暂无翻译

Hanqing Sun,Yanwei Pang,Jiale Cao,Jin Xie,Xuelong Li

Vision Transformers have shown promising progress in various object detection tasks, including monocular 2D/3D detection and surround-view 3D detection. However, when used in essential and classic stereo 3D object detection, directly adopting those surround-view Transformers leads to slow convergence and significant precision drops. We argue that one of the causes of this defect is that the surround-view Transformers do not consider the stereo-specific image correspondence information. In a surround-view system, the overlapping areas are small, and thus correspondence is not a primary issue. In this paper, we explore the model design of vision Transformers in stereo 3D object detection, focusing particularly on extracting and encoding the task-specific image correspondence information. To achieve this goal, we present TS3D, a Transformer-based Stereo-aware 3D object detector. In the TS3D, a Disparity-Aware Positional Encoding (DAPE) model is proposed to embed the image correspondence information into stereo features. The correspondence is encoded as normalized disparity and is used in conjunction with sinusoidal 2D positional encoding to provide the location information of the 3D scene. To extract enriched multi-scale stereo features, we propose a Stereo Reserving Feature Pyramid Network (SRFPN). The SRFPN is designed to reserve the correspondence information while fusing intra-scale and aggregating cross-scale stereo features. Our proposed TS3D achieves a 41.29% Moderate Car detection average precision on the KITTI test set and takes 88 ms to detect objects from each binocular image pair. It is competitive with advanced counterparts in terms of both precision and inference speed.

翻译：暂无翻译

0

相关内容

INFORMS

《计算机信息》杂志发表高质量的论文，扩大了运筹学和计算的范围，寻求有关理论、方法、实验、系统和应用方面的原创研究论文、新颖的调查和教程论文，以及描述新的和有用的软件工具的论文。官网链接：https://pubsonline.informs.org/journal/ijoc

【CVPR2022】自动驾驶中的伪双目三维目标检测，Pseudo-Stereo for Monocular 3D Object Detection in Autonomous Driving

【CVPR2022】自动驾驶中的伪双目三维目标检测，Pseudo-Stereo for Monocular 3D Object Detection in Autonomous Driving

专知会员服务

18+阅读 · 2022年3月19日

【三维物体和手部姿态估计】综述论文最新进展，Recent Advances in 3D Object and Hand Pose Estimation

【三维物体和手部姿态估计】综述论文最新进展，Recent Advances in 3D Object and Hand Pose Estimation

专知会员服务

21+阅读 · 2020年6月13日

【CVPR2020】通过潦草注释的弱监督显著目标检测，Weakly-Supervised Salient Object Detection via Scribble Annotations

【CVPR2020】通过潦草注释的弱监督显著目标检测，Weakly-Supervised Salient Object Detection via Scribble Annotations

专知会员服务

39+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

专知会员服务

58+阅读 · 2020年1月25日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

【泡泡汇总】CVPR2019 SLAM Paperlist

【泡泡汇总】CVPR2019 SLAM Paperlist

泡泡机器人SLAM

14+阅读 · 2019年6月12日

CVPR2019| 9篇CVPR论文开源代码（行人检测/物体检测/3D Face等）

CVPR2019| 9篇CVPR论文开源代码（行人检测/物体检测/3D Face等）

极市平台

12+阅读 · 2019年5月31日

CVPR2019 | Stereo R-CNN 3D 目标检测

CVPR2019 | Stereo R-CNN 3D 目标检测

极市平台

27+阅读 · 2019年3月10日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

专知

13+阅读 · 2018年6月24日

【论文推荐】最新5篇目标检测相关论文——显著目标检测、弱监督One-Shot检测、多框检测器、携带物体检测、假彩色图像检测

【论文推荐】最新5篇目标检测相关论文——显著目标检测、弱监督One-Shot检测、多框检测器、携带物体检测、假彩色图像检测

专知

74+阅读 · 2018年1月16日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

【推荐】YOLO实时目标检测(6fps)

【推荐】YOLO实时目标检测(6fps)

机器学习研究会

20+阅读 · 2017年11月5日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

航空叶片多光学传感器多尺度测量点云高效拼合方法

国家自然科学基金

0+阅读 · 2015年12月31日

（氧）氮化物光电极太阳能分解水制氢的研究

国家自然科学基金

0+阅读 · 2014年12月31日

雌激素上调酸敏感离子通道：疼痛性别差异的一个新分子机制

国家自然科学基金

0+阅读 · 2014年12月31日

面向拥挤监控场景的异常事件检测技术研究

国家自然科学基金

1+阅读 · 2013年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

SUMO/DeSUMO化修饰在抑制性受体膜转运中的作用

国家自然科学基金

0+阅读 · 2012年12月31日

基于Decorin基因甲基化调控的非小细胞肺癌转移的分子机制

国家自然科学基金

0+阅读 · 2011年12月31日

内质网应激参与镉诱发的植物细胞程序性死亡的机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于局部不变性特征流的相异场景密集匹配

国家自然科学基金

0+阅读 · 2011年12月31日

基于不变性知觉的双目视觉注意机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

DSPDet3D: Dynamic Spatial Pruning for 3D Small Object Detection

Arxiv

0+阅读 · 2023年6月5日

Introducing Depth into Transformer-based 3D Object Detection

Arxiv

0+阅读 · 2023年6月5日

Object as Query: Lifting any 2D Object Detector to 3D Detection

Arxiv

0+阅读 · 2023年6月5日

OCBEV: Object-Centric BEV Transformer for Multi-View 3D Object Detection

Arxiv

0+阅读 · 2023年6月2日

Backchannel Detection and Agreement Estimation from Video with Transformer Networks

Arxiv

0+阅读 · 2023年6月2日

Bi-LRFusion: Bi-Directional LiDAR-Radar Fusion for 3D Dynamic Object Detection

Arxiv

0+阅读 · 2023年6月2日

SSD-MonoDETR: Supervised Scale-aware Deformable Transformer for Monocular 3D Object Detection

Arxiv

0+阅读 · 2023年6月2日

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

Arxiv

19+阅读 · 2020年11月18日

3D Backbone Network for 3D Object Detection

Arxiv

12+阅读 · 2019年1月24日

Mobile Video Object Detection with Temporally-Aware Feature Maps

Arxiv

11+阅读 · 2018年3月28日

VIP会员

文章信息

相关主题

查准率/准确率

相关VIP内容

【CVPR2022】自动驾驶中的伪双目三维目标检测，Pseudo-Stereo for Monocular 3D Object Detection in Autonomous Driving

【CVPR2022】自动驾驶中的伪双目三维目标检测，Pseudo-Stereo for Monocular 3D Object Detection in Autonomous Driving

专知会员服务

18+阅读 · 2022年3月19日

【三维物体和手部姿态估计】综述论文最新进展，Recent Advances in 3D Object and Hand Pose Estimation

【三维物体和手部姿态估计】综述论文最新进展，Recent Advances in 3D Object and Hand Pose Estimation

专知会员服务

21+阅读 · 2020年6月13日

【CVPR2020】通过潦草注释的弱监督显著目标检测，Weakly-Supervised Salient Object Detection via Scribble Annotations

【CVPR2020】通过潦草注释的弱监督显著目标检测，Weakly-Supervised Salient Object Detection via Scribble Annotations

专知会员服务

39+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

【深度学习架构、模型和技巧集合(TensorFlow/PyTorch)】’Deep Learning Models - A collection of various deep learning architectures, models, and tips'

专知会员服务

58+阅读 · 2020年1月25日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《战区安全决策课程体系》最新244页

《"无人机航母"原型平台》

任务规划与地形分析：现代复杂环境作战导航体系

《攻击场景描述形式化模型研究》

相关资讯

【泡泡汇总】CVPR2019 SLAM Paperlist

【泡泡汇总】CVPR2019 SLAM Paperlist

泡泡机器人SLAM

14+阅读 · 2019年6月12日

CVPR2019| 9篇CVPR论文开源代码（行人检测/物体检测/3D Face等）

CVPR2019| 9篇CVPR论文开源代码（行人检测/物体检测/3D Face等）

极市平台

12+阅读 · 2019年5月31日

CVPR2019 | Stereo R-CNN 3D 目标检测

CVPR2019 | Stereo R-CNN 3D 目标检测

极市平台

27+阅读 · 2019年3月10日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

专知

13+阅读 · 2018年6月24日

【论文推荐】最新5篇目标检测相关论文——显著目标检测、弱监督One-Shot检测、多框检测器、携带物体检测、假彩色图像检测

【论文推荐】最新5篇目标检测相关论文——显著目标检测、弱监督One-Shot检测、多框检测器、携带物体检测、假彩色图像检测

专知

74+阅读 · 2018年1月16日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

【推荐】YOLO实时目标检测(6fps)

【推荐】YOLO实时目标检测(6fps)

机器学习研究会

20+阅读 · 2017年11月5日

【推荐】全卷积语义分割综述

【推荐】全卷积语义分割综述

机器学习研究会

19+阅读 · 2017年8月31日

相关论文

DSPDet3D: Dynamic Spatial Pruning for 3D Small Object Detection

Arxiv

0+阅读 · 2023年6月5日

Introducing Depth into Transformer-based 3D Object Detection

Arxiv

0+阅读 · 2023年6月5日

Object as Query: Lifting any 2D Object Detector to 3D Detection

Arxiv

0+阅读 · 2023年6月5日

OCBEV: Object-Centric BEV Transformer for Multi-View 3D Object Detection

Arxiv

0+阅读 · 2023年6月2日

Backchannel Detection and Agreement Estimation from Video with Transformer Networks

Arxiv

0+阅读 · 2023年6月2日

Bi-LRFusion: Bi-Directional LiDAR-Radar Fusion for 3D Dynamic Object Detection

Arxiv

0+阅读 · 2023年6月2日

SSD-MonoDETR: Supervised Scale-aware Deformable Transformer for Monocular 3D Object Detection

Arxiv

0+阅读 · 2023年6月2日

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

Arxiv

19+阅读 · 2020年11月18日

3D Backbone Network for 3D Object Detection

Arxiv

12+阅读 · 2019年1月24日

Mobile Video Object Detection with Temporally-Aware Feature Maps

Arxiv

11+阅读 · 2018年3月28日

相关基金

航空叶片多光学传感器多尺度测量点云高效拼合方法

国家自然科学基金

0+阅读 · 2015年12月31日

（氧）氮化物光电极太阳能分解水制氢的研究

国家自然科学基金

0+阅读 · 2014年12月31日

雌激素上调酸敏感离子通道：疼痛性别差异的一个新分子机制

国家自然科学基金

0+阅读 · 2014年12月31日

面向拥挤监控场景的异常事件检测技术研究

国家自然科学基金

1+阅读 · 2013年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

SUMO/DeSUMO化修饰在抑制性受体膜转运中的作用

国家自然科学基金

0+阅读 · 2012年12月31日

基于Decorin基因甲基化调控的非小细胞肺癌转移的分子机制

国家自然科学基金

0+阅读 · 2011年12月31日

内质网应激参与镉诱发的植物细胞程序性死亡的机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于局部不变性特征流的相异场景密集匹配

国家自然科学基金

0+阅读 · 2011年12月31日

基于不变性知觉的双目视觉注意机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员