Object detection in three-dimensional (3D) space attracts considerable interest from academia and industry because it is an essential task in AI-driven applications such as robotics, autonomous driving, and augmented reality. As the basic format of 3D data, the point cloud provides detailed geometric information about objects in the original 3D space. However, because point clouds are sparse and unordered, specially designed networks and modules are needed to process them. Attention mechanisms have achieved impressive performance in diverse computer vision tasks; however, it remains unclear how attention modules affect the performance of 3D point cloud object detection and which kinds of attention modules suit the inherent properties of 3D data. This work investigates the role of the attention mechanism in 3D point cloud object detection and provides insights into the potential of different attention modules. To that end, we comprehensively investigate classical 2D attention modules and novel 3D attention modules, including the latest point cloud transformers, on the SUN RGB-D and ScanNetV2 datasets. Based on detailed experiments and analysis, we summarize the effects of the different attention modules. This paper is expected to serve as a reference for attention-embedded 3D point cloud object detection. The code and trained models are available at: https://github.com/ShiQiu0419/attentions_in_3D_detection.