The calibration of deep learning-based perception models plays a crucial role in their reliability. Our work focuses on a class-wise evaluation of several models' confidence performance for LiDAR-based semantic segmentation, with the aim of providing insights into the calibration of underrepresented classes. Those classes often include vulnerable road users (VRUs) and are thus of particular interest for safety reasons. Using a metric based on sparsification curves, we compare the calibration abilities of three semantic segmentation models with different architectural concepts, each in a deterministic and a probabilistic version. By identifying and describing the dependency between the predictive performance of a class and the respective calibration quality, we aim to facilitate model selection and refinement for safety-critical applications.