We propose a new attention mechanism, called Global Hierarchical Attention (GHA), for 3D point cloud analysis. GHA approximates regular global dot-product attention via a series of coarsening and interpolation operations over multiple hierarchy levels. The advantage of GHA is two-fold. First, it has linear complexity with respect to the number of points, enabling the processing of large point clouds. Second, GHA inherently possesses the inductive bias to focus on spatially close points, while retaining global connectivity among all points. Combined with a feedforward network, GHA can be inserted into many existing network architectures. We experiment with multiple baseline networks and show that adding GHA consistently improves performance across different tasks and datasets. For the task of semantic segmentation, GHA gives a +1.7% mIoU increase to the MinkowskiEngine baseline on ScanNet. For the 3D object detection task, GHA improves the CenterPoint baseline by +0.5% mAP on the nuScenes dataset, and the 3DETR baseline by +2.1% mAP25 and +1.5% mAP50 on ScanNet.
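The coarsening step described above can be illustrated with a minimal single-level sketch. This is my own illustration of the general idea (pooled keys/values give each query a cheap summary of distant points), not the authors' implementation; the average-pooling scheme, `pool_size`, and all function names are assumptions. GHA additionally stacks multiple hierarchy levels with interpolation to reach linear complexity, whereas this one-level sketch only reduces the cost from O(N^2) to O(N * N / pool_size):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def coarsen(feats, pool_size):
    # Average-pool groups of consecutive points into coarse "summary" tokens.
    # (A real point-cloud method would group by spatial proximity, e.g. voxels.)
    n, d = feats.shape
    pad = (-n) % pool_size
    padded = np.concatenate([feats, np.zeros((pad, d))]) if pad else feats
    groups = padded.reshape(-1, pool_size, d)
    counts = np.full((groups.shape[0], 1), float(pool_size))
    if pad:
        counts[-1, 0] = pool_size - pad  # last group is partially filled
    return groups.sum(axis=1) / counts

def coarse_attention(q, k, v, pool_size=4):
    # Each query attends to pooled keys/values instead of all N points.
    k_c = coarsen(k, pool_size)
    v_c = coarsen(v, pool_size)
    attn = softmax(q @ k_c.T / np.sqrt(q.shape[-1]))
    return attn @ v_c

rng = np.random.default_rng(0)
x = rng.normal(size=(10, 8))       # 10 points, 8-dim features
out = coarse_attention(x, x, x)    # attends to 3 coarse tokens, not 10 points
print(out.shape)                   # (10, 8)
```

Repeating the coarsening across levels and interpolating the coarse outputs back to the fine points is what gives the hierarchical variant its locality bias while keeping every point globally connected.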