Feature learning for 3D object detection from point clouds is challenging because 3D point cloud data is irregular. In this paper, we propose Pointformer, a Transformer backbone designed for 3D point clouds to learn features effectively. Specifically, a Local Transformer module models interactions among points in a local region and learns context-dependent region features at the object level. A Global Transformer learns context-aware representations at the scene level. To further capture the dependencies among multi-scale representations, we propose a Local-Global Transformer that integrates local features with global features from higher resolution. In addition, we introduce an efficient coordinate refinement module that shifts down-sampled points closer to object centroids, which improves object proposal generation. We use Pointformer as the backbone for state-of-the-art object detection models and demonstrate significant improvements over the original models on both indoor and outdoor datasets.
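As a rough illustration of the Local Transformer and coordinate refinement ideas, the following minimal sketch groups points around a down-sampled centroid, applies self-attention within that local region, and then shifts the centroid to an attention-weighted mean of the region's coordinates. It is not the Pointformer implementation. The nearest-neighbour grouping, the identity query/key/value projections, the use of raw coordinates as features, the averaging of attention over queries, and all function names (`knn`, `self_attention`, `refine_centroid`) are assumptions made for illustration only.

```python
import math

def knn(points, center, k):
    """Return the k points closest to `center` (assumed grouping rule)."""
    return sorted(points, key=lambda p: math.dist(p, center))[:k]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(feats):
    """Scaled dot-product self-attention over one local region.

    Assumption: identity Q/K/V projections; a real model learns them.
    Returns the attended features and the attention matrix.
    """
    d = len(feats[0])
    n = len(feats)
    out, weights = [], []
    for q in feats:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in feats]
        w = softmax(scores)
        weights.append(w)
        out.append([sum(w[j] * feats[j][c] for j in range(n)) for c in range(d)])
    return out, weights

def refine_centroid(region, weights):
    """Move the centroid to an attention-weighted mean of region coordinates.

    Assumption: the weight of a point is the average attention it receives
    across all queries in the region.
    """
    n = len(region)
    avg = [sum(weights[i][j] for i in range(n)) / n for j in range(n)]
    return tuple(sum(avg[j] * region[j][c] for j in range(n)) for c in range(3))

# Toy scene: three nearby points and one distant outlier.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (5.0, 5.0, 5.0)]
region = knn(points, center=(0.0, 0.0, 0.0), k=3)

# Raw coordinates stand in for per-point features (illustrative assumption).
feats, w = self_attention([list(p) for p in region])
refined = refine_centroid(region, w)
print(refined)
```

Because the averaged attention weights sum to one, the refined centroid is a convex combination of the points in the region, so it always stays inside the region's bounding box.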