In state-of-the-art deep learning for object recognition, SoftMax and Sigmoid functions are the most commonly used prediction layers. These layers often produce overconfident predictions rather than proper probabilistic scores, which can harm decision-making in 'critical' perception systems such as those used in autonomous driving and robotics. To address this, we propose a probabilistic approach based on distributions computed from the Logit-layer scores of pre-trained networks. We demonstrate that Maximum Likelihood (ML) and Maximum a-Posteriori (MAP) functions are more suitable for probabilistic interpretation than SoftMax- and Sigmoid-based predictions for object recognition. We explore distinct sensor modalities, RGB images and LiDAR range-view (RV) data, from the KITTI and Lyft Level-5 datasets, where our approach shows promising performance compared to the usual SoftMax and Sigmoid layers, with the added benefit of interpretable probabilistic predictions. A further advantage is that the ML and MAP functions can be applied to existing trained networks, since they operate on the output of the Logit layer. No new training phase is needed, because the ML and MAP functions are used in the test/prediction phase.
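The idea of replacing SoftMax with ML and MAP scores built from Logit-layer distributions can be illustrated with a minimal sketch. The abstract does not specify how the distributions are modeled, so everything below is an assumption made for illustration: each class is given a one-dimensional Gaussian fitted to its own logit, the function names (`fit_class_gaussians`, `ml_scores`, `map_scores`) are hypothetical, and the data is synthetic rather than taken from KITTI or Lyft Level-5.

```python
import numpy as np


def fit_class_gaussians(logits, labels, num_classes):
    """Fit one Gaussian per class to the k-th logit of samples labeled k.

    Assumption: a 1-D Gaussian per class; the source only says that
    distributions are computed from Logit-layer scores.
    """
    means = np.zeros(num_classes)
    stds = np.zeros(num_classes)
    for k in range(num_classes):
        scores = logits[labels == k, k]
        means[k] = scores.mean()
        stds[k] = scores.std() + 1e-6  # avoid a zero standard deviation
    return means, stds


def gaussian_pdf(x, mean, std):
    """Gaussian density, broadcast over the class axis."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def ml_scores(logits, means, stds):
    """ML scores: class likelihoods of each logit, normalized per sample."""
    lik = gaussian_pdf(logits, means, stds) + 1e-12  # guard against underflow
    return lik / lik.sum(axis=1, keepdims=True)


def map_scores(logits, means, stds, priors):
    """MAP scores: likelihood times class prior, normalized per sample."""
    post = (gaussian_pdf(logits, means, stds) + 1e-12) * priors
    return post / post.sum(axis=1, keepdims=True)


# Synthetic stand-in for the Logit outputs of a pre-trained 3-class network.
rng = np.random.default_rng(0)
train_labels = np.repeat(np.arange(3), 100)
train_logits = rng.normal(0.0, 1.0, size=(300, 3))
train_logits[np.arange(300), train_labels] += 4.0  # true-class logit is larger

means, stds = fit_class_gaussians(train_logits, train_labels, num_classes=3)
priors = np.bincount(train_labels, minlength=3) / train_labels.size

test_logits = rng.normal(0.0, 1.0, size=(5, 3))
test_logits[:, 1] += 4.0  # samples that resemble class 1
ml = ml_scores(test_logits, means, stds)
mp = map_scores(test_logits, means, stds, priors)
```

In this sketch the network weights are never touched: the Gaussians are fitted once from logits of already-labeled data, and the ML and MAP scores are computed at prediction time, which mirrors the claim that no retraining is needed.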