Point clouds are a key modality for perception in autonomous vehicles, providing a robust geometric understanding of the surrounding environment. However, although the sensor outputs of autonomous vehicles are inherently temporal, the use of point cloud sequences for 3D semantic segmentation remains underexplored. In this paper we propose a novel Sparse Temporal Local Attention (STELA) module, which aggregates intermediate features from a local neighbourhood in previous point cloud frames to provide rich temporal context to the decoder. Using a sparse local neighbourhood enables our approach to gather features more flexibly than approaches that directly match point features, and more efficiently than approaches that perform expensive global attention over the whole point cloud frame. We achieve a competitive mIoU of 64.3% on the SemanticKITTI dataset, and demonstrate significant improvement over the single-frame baseline in our ablation studies.
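To make the idea of attending over a local neighbourhood in a previous frame concrete, the following is a minimal NumPy sketch, not the STELA module itself. It assumes the two frames are already expressed in a common coordinate system and uses brute-force nearest-neighbour search; the function name `local_temporal_attention` and the parameters `k` and `radius` are hypothetical choices made for illustration only.

```python
import numpy as np


def local_temporal_attention(cur_xyz, cur_feat, prev_xyz, prev_feat, k=4, radius=1.0):
    """Aggregate previous-frame features around each current-frame point.

    cur_xyz: (N, 3), cur_feat: (N, C), prev_xyz: (M, 3), prev_feat: (M, C).
    Returns an (N, C) array; a row is all zeros when no previous-frame
    point lies within `radius` of the corresponding current point.
    """
    c = cur_feat.shape[1]
    k = min(k, prev_xyz.shape[0])

    # Brute-force pairwise distances (N, M). Illustrative only: a practical
    # system would use a sparse or voxel-based neighbour lookup instead.
    dist = np.linalg.norm(cur_xyz[:, None, :] - prev_xyz[None, :, :], axis=-1)

    # The k nearest previous-frame points for each current point.
    nbr_idx = np.argsort(dist, axis=1)[:, :k]               # (N, k)
    nbr_dist = np.take_along_axis(dist, nbr_idx, axis=1)    # (N, k)
    valid = nbr_dist <= radius                               # (N, k)

    # Keys and values are the neighbours' features (no learned projections here).
    keys = prev_feat[nbr_idx]                                # (N, k, C)

    # Scaled dot-product scores between each query and its local keys.
    scores = np.einsum("nc,nkc->nk", cur_feat, keys) / np.sqrt(c)
    scores = np.where(valid, scores, -np.inf)

    # Masked softmax that tolerates rows with no valid neighbour.
    row_max = np.max(scores, axis=1, keepdims=True)
    row_max = np.where(np.isfinite(row_max), row_max, 0.0)
    weights = np.exp(scores - row_max)
    denom = weights.sum(axis=1, keepdims=True)
    weights = np.divide(weights, denom, out=np.zeros_like(weights), where=denom > 0)

    # Attention-weighted sum of neighbour features.
    return np.einsum("nk,nkc->nc", weights, keys)
```

Because each query attends to at most `k` neighbours, the attention cost in this sketch grows with N·k rather than with N·M as in global attention. Unlike a hard one-to-one feature match, the softmax weights let each point blend several nearby features.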