Object detection in 3D point clouds is a crucial task in a range of computer vision applications, including robotics, autonomous cars, and augmented reality. This work addresses the task using a highly efficient, surface-biased feature extraction method (wang2022rbgnet) that also captures contextual cues on multiple levels. We propose a 3D object detector that extracts accurate feature representations of object candidates and applies self-attention at three levels: on point patches, on object candidates, and on the global scene. Self-attention has been shown to be effective in encoding correlation information in 3D point clouds (xie2020mlcvnet). Other 3D detectors focus on enhancing point cloud feature extraction by selectively obtaining more meaningful local features (wang2022rbgnet), but they overlook contextual information. To this end, the proposed architecture combines ray-based, surface-biased feature extraction with multi-level context encoding to outperform the state-of-the-art 3D object detector. 3D detection experiments are performed on scenes from the ScanNet dataset, with the self-attention modules introduced one after the other to isolate the effect of self-attention at each level.
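
As a rough illustration of how self-attention can encode correlations among object candidates, the following is a minimal sketch of scaled dot-product self-attention over a small set of candidate feature vectors. It is not the architecture proposed here: the learned query, key, and value projections are omitted (identity projections are assumed), and the names `self_attention` and `candidates` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Scaled dot-product self-attention with identity projections.

    features: list of N feature vectors (lists of floats), each of length d.
    Returns N context vectors; each is a weighted average of all inputs,
    weighted by similarity to the corresponding query vector.
    """
    d = len(features[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q in features:
        # Similarity of this candidate to every candidate (including itself).
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in features]
        weights = softmax(scores)
        # Context vector: attention-weighted sum of all candidate features.
        out.append([sum(w * v[j] for w, v in zip(weights, features))
                    for j in range(d)])
    return out


# Three hypothetical object-candidate features (assumed values).
candidates = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
refined = self_attention(candidates)
```

Because the attention weights for each candidate sum to one, every output vector is a convex combination of the input features, so similar candidates pull each other's representations closer together. The same operation could in principle be applied to point-patch features or to a single global scene descriptor, which is the sense in which context is encoded at multiple levels.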