LiDARs and cameras are the two main sensors planned for inclusion in many announced autonomous vehicle prototypes. Each provides a unique form of data, capturing the surrounding environment from a different perspective. In this paper, we explore and attempt to answer the question: is there an added benefit to fusing these two forms of data for semantic segmentation in the context of autonomous driving? We also attempt to show at which level such fusion proves most useful. We evaluated our algorithms on the publicly available SemanticKITTI dataset. All fusion models show improvements over the base model, with mid-level fusion yielding the largest gain of 2.7% in terms of the mean Intersection over Union (mIoU) metric.