Attention-based Proposals Refinement for 3D Object Detection

Safe autonomous driving depends heavily on accurate 3D object detection, since it produces the input to safety-critical downstream tasks such as prediction and navigation. Recent advances in this field come from developing a refinement stage for voxel-based region proposal networks to better balance accuracy and efficiency. A popular approach among state-of-the-art frameworks is to divide proposals, or Regions of Interest (ROIs), into grids and extract a feature for each grid location before synthesizing them into an ROI feature. While achieving impressive performance, such an approach involves a number of hand-crafted components (e.g. grid sampling, set abstraction) which require expert knowledge to tune correctly. This paper takes a more data-driven approach to ROI feature extraction using the attention mechanism. Specifically, points inside an ROI are positionally encoded to incorporate the ROI's geometry. The resulting position encodings and the point features are transformed into an ROI feature via vector attention. Unlike the original multi-head attention, vector attention assigns different weights to different channels within a point feature, and is thus able to capture a more sophisticated relation between the pooled points and the ROI. Experiments on the KITTI validation set show that our method achieves a competitive performance of 84.84 AP for class Car at Moderate difficulty while having the fewest parameters among closely related methods and attaining a quasi-real-time inference speed of 15 FPS on an NVIDIA V100 GPU. The code will be released.
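To make the contrast between vector attention and ordinary scalar attention concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the position encoding (point offsets from the ROI center, scaled by ROI size and linearly projected), the weight matrices, the single-head scalar baseline, and all function names are assumptions chosen for illustration only.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def roi_position_encoding(points, roi_center, roi_size, w_pos):
    # Hypothetical encoding: offsets to the ROI center, scaled by the ROI size,
    # then projected to the feature dimension C.
    offsets = (points - roi_center) / roi_size   # (N, 3)
    return offsets @ w_pos                       # (N, C)

def vector_attention_pool(feats, pos_enc, w_score, w_value):
    # Vector attention: one weight per point AND per channel.
    x = feats + pos_enc                          # (N, C)
    weights = softmax(x @ w_score, axis=0)       # (N, C), normalized over points
    values = x @ w_value                         # (N, C)
    return (weights * values).sum(axis=0), weights

def scalar_attention_pool(feats, pos_enc, query, w_value):
    # Scalar (single-head dot-product) attention: one weight per point,
    # shared by all channels of that point.
    x = feats + pos_enc                          # (N, C)
    weights = softmax(x @ query / np.sqrt(x.shape[1]), axis=0)  # (N,)
    values = x @ w_value                         # (N, C)
    return weights @ values, weights

rng = np.random.default_rng(0)
N, C = 16, 8                                     # pooled points, channels (arbitrary)
points = rng.uniform(-1.0, 1.0, size=(N, 3))
feats = rng.normal(size=(N, C))
roi_center = np.zeros(3)
roi_size = np.array([3.9, 1.6, 1.5])             # illustrative box dimensions
w_pos = rng.normal(size=(3, C))                  # random stand-ins for learned weights
w_score = rng.normal(size=(C, C))
w_value = rng.normal(size=(C, C))
query = rng.normal(size=C)

pos_enc = roi_position_encoding(points, roi_center, roi_size, w_pos)
roi_vec, w_vec = vector_attention_pool(feats, pos_enc, w_score, w_value)
roi_sca, w_sca = scalar_attention_pool(feats, pos_enc, query, w_value)
print(roi_vec.shape, w_vec.shape, w_sca.shape)   # (8,) (16, 8) (16,)
```

Both variants return one C-dimensional ROI feature. The difference lies in the weights: the scalar version yields a single weight per point, while the vector version yields an (N, C) weight matrix, so each channel can attend to a different subset of the pooled points.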


Related content

Attention mechanism


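The attention mechanism was first proposed in the field of visual image processing, but it really took off with the Google DeepMind paper "Recurrent Models of Visual Attention" [14], which used an attention mechanism on an RNN model for image classification. Subsequently, Bahdanau et al., in "Neural Machine Translation by Jointly Learning to Align and Translate" [1], used an attention-like mechanism to perform translation and alignment jointly in a machine translation task; their work is regarded as the first to apply the attention mechanism to NLP. Extensions of attention-based RNN models then began to be applied to a variety of NLP tasks. More recently, how to use attention in CNNs has also become an active research topic. The figure below shows the general trend of attention research.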

[CVPR 2022] Pseudo-Stereo for Monocular 3D Object Detection in Autonomous Driving (March 19, 2022)

[CVPR 2020, Facebook] FroDO: From Detections to 3D Objects (May 12, 2020)

CVPR 2020 collection of papers with open-source code (March 12, 2020)

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation (October 17, 2019)