Autonomous driving (AD) perception today relies heavily on deep-learning-based architectures, which require large-scale annotated datasets and incur the associated curation and annotation costs. 3D semantic data are useful for core perception tasks such as obstacle detection and ego-vehicle localization. We propose a new dataset, Navya 3D Segmentation (Navya3DSeg), with a diverse label space corresponding to a large-scale, production-grade operational domain that includes rural and urban areas, industrial sites, and universities in 13 countries. It contains 23 labeled sequences and 25 supplementary unlabeled sequences, designed to support self-supervised and semi-supervised semantic segmentation benchmarks on point clouds. We also propose a novel method for generating sequential dataset splits based on iterative multi-label stratification, and show that it achieves a +1.2% mIoU improvement over the original split proposed by the SemanticKITTI dataset. We perform a complete benchmark of the semantic segmentation task with state-of-the-art methods. Finally, we demonstrate an active learning (AL) based dataset distillation framework and, in the context of AL, introduce a novel heuristic-free sampling method called distance sampling. A detailed presentation of the dataset is available at https://www.youtube.com/watch?v=5m6ALIs-s20 .