Range-view based LiDAR segmentation methods are attractive for practical applications because they directly inherit from efficient 2D CNN architectures. In the literature, most range-view based methods follow the per-pixel classification paradigm. Recently, in the image segmentation domain, another paradigm that formulates segmentation as a mask-classification problem has achieved remarkable performance. This raises an interesting question: can the mask-classification paradigm benefit range-view based LiDAR segmentation and outperform its per-pixel counterpart? To answer this question, we propose MaskRange, a unified mask-classification model for range-view based LiDAR semantic and panoptic segmentation. Along with the new paradigm, we also propose a novel data augmentation method to alleviate the overfitting, context-reliance, and class-imbalance problems. Extensive experiments are conducted on the SemanticKITTI benchmark. Among all published range-view based methods, our MaskRange achieves state-of-the-art performance with $66.10$ mIoU on semantic segmentation and promising results with $53.10$ PQ on panoptic segmentation, with high efficiency. Our code will be released.
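To make the paradigm contrast concrete, below is a minimal PyTorch sketch of the two head designs over range-image features. This is not the MaskRange architecture itself; the module names, the query-based design, and all dimensions are illustrative assumptions in the spirit of MaskFormer-style mask classification.

```python
import torch
import torch.nn as nn


class PerPixelHead(nn.Module):
    """Per-pixel paradigm: classify every range-image pixel directly."""

    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        self.classifier = nn.Conv2d(feat_dim, num_classes, kernel_size=1)

    def forward(self, feats):          # feats: (B, C, H, W)
        return self.classifier(feats)  # logits: (B, num_classes, H, W)


class MaskClassificationHead(nn.Module):
    """Mask-classification paradigm: predict N binary masks plus one
    class label per mask, then combine them into per-pixel logits."""

    def __init__(self, feat_dim: int, num_queries: int, num_classes: int):
        super().__init__()
        self.queries = nn.Embedding(num_queries, feat_dim)
        # +1 for the "no object" class, as in MaskFormer-style heads.
        self.class_head = nn.Linear(feat_dim, num_classes + 1)
        self.mask_head = nn.Linear(feat_dim, feat_dim)

    def forward(self, feats):  # feats: (B, C, H, W)
        B = feats.shape[0]
        q = self.queries.weight.unsqueeze(0).expand(B, -1, -1)  # (B, N, C)
        class_logits = self.class_head(q)                       # (B, N, K+1)
        mask_embed = self.mask_head(q)                          # (B, N, C)
        # Dot product between mask embeddings and pixel features.
        mask_logits = torch.einsum("bnc,bchw->bnhw", mask_embed, feats)
        # Semantic logits via soft assignment of masks to classes
        # (drop the "no object" column before combining).
        sem = torch.einsum("bnk,bnhw->bkhw",
                           class_logits.softmax(-1)[..., :-1],
                           mask_logits.sigmoid())
        return class_logits, mask_logits, sem
```

The key property this sketch illustrates is that mask classification decouples "what" (one class label per mask) from "where" (one binary mask per query), which is what allows a single model to serve both semantic segmentation (combine masks into per-pixel logits) and panoptic segmentation (keep the masks as individual segments).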