The recent trend in multi-camera 3D object detection is toward the unified bird's-eye view (BEV) representation. However, directly transforming features extracted from the image-plane view to BEV inevitably causes feature distortion, especially around the objects of interest, blurring the objects into the background. To this end, we propose OA-BEV, a network that can be plugged into BEV-based 3D object detection frameworks to bring out the objects by incorporating object-aware pseudo-3D features and depth features. Such features carry information about the objects' positions and 3D structures. First, we explicitly guide the network to learn the depth distribution with object-level supervision derived from each 3D object's center. Then, we select the foreground pixels with a 2D object detector and project them into 3D space for pseudo-voxel feature encoding. Finally, the object-aware depth features and pseudo-voxel features are incorporated into the BEV representation via a deformable attention mechanism. We conduct extensive experiments on the nuScenes dataset to validate the merits of the proposed OA-BEV. Our method achieves consistent improvements over BEV-based baselines in terms of both average precision and nuScenes detection score. Our code will be released.
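The pseudo-voxel step described above — projecting detected foreground pixels into 3D space using predicted depth, then encoding them on a voxel grid — can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the function name, the simple occupancy-count encoding, and the single-camera pinhole setup are assumptions made for clarity.

```python
# Hypothetical sketch (not the authors' code): lift foreground image pixels
# to 3D pseudo-points with predicted depth, then scatter them into a coarse
# voxel grid -- a minimal version of the "pseudo-voxel" idea.
import numpy as np

def lift_pixels_to_voxels(pixels_uv, depths, K, grid_min, voxel_size, grid_shape):
    """pixels_uv: (N, 2) pixel coordinates of foreground pixels;
    depths: (N,) predicted metric depths; K: (3, 3) camera intrinsics.
    Returns an occupancy-count grid of shape grid_shape."""
    # Back-project each pixel: X_cam = depth * K^-1 @ [u, v, 1]^T
    ones = np.ones((pixels_uv.shape[0], 1))
    rays = np.linalg.inv(K) @ np.hstack([pixels_uv, ones]).T   # (3, N)
    pts = (rays * depths).T                                    # (N, 3) camera-frame points
    # Voxelize: index each point into the grid, drop out-of-range points.
    idx = np.floor((pts - grid_min) / voxel_size).astype(int)
    in_range = np.all((idx >= 0) & (idx < np.array(grid_shape)), axis=1)
    grid = np.zeros(grid_shape, dtype=np.int32)
    np.add.at(grid, tuple(idx[in_range].T), 1)  # accumulate point counts
    return grid

# Toy example: two foreground pixels at 10 m and 12 m predicted depth.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0,   0.0,   1.0]])
uv = np.array([[320.0, 240.0], [400.0, 240.0]])
grid = lift_pixels_to_voxels(uv, np.array([10.0, 12.0]), K,
                             grid_min=np.array([-20.0, -20.0, 0.0]),
                             voxel_size=1.0, grid_shape=(40, 40, 40))
print(int(grid.sum()))  # prints 2: both points land inside the grid
```

In the full method, such per-voxel features would be produced by a learned encoder rather than raw counts, and fused into the BEV representation alongside the depth features via deformable attention.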