Improving Video Instance Segmentation by Light-weight Temporal Uncertainty Estimates

Instance segmentation with neural networks is an essential task in environment perception. Many works have observed that neural networks can predict false positive instances with high confidence values and true positives with low ones. Thus, it is important to accurately model the uncertainties of neural networks in order to prevent safety issues and foster interpretability. In applications such as automated driving, the reliability of neural networks is of utmost importance. In this paper, we present a time-dynamic approach to model uncertainties of instance segmentation networks and apply it to the detection of false positives as well as the estimation of prediction quality. The availability of image sequences in online applications allows for tracking instances over multiple frames. Based on an instance's history of shape and uncertainty information, we construct temporal instance-wise aggregated metrics. These metrics are used as input to post-processing models that estimate the prediction quality in terms of instance-wise intersection over union (IoU). The proposed method only requires an already trained neural network (that may operate on single frames) and video sequence input. In our experiments, we further demonstrate the use of the proposed method by replacing the traditional score value from object detection and thereby improving the overall performance of the instance segmentation network.
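The pipeline described in the abstract (per-frame metrics of a tracked instance, aggregated over time, fed to a post-processing model that estimates instance-wise intersection over union) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify which metrics are used, how tracks are formed, or which post-processing models are trained. The function names `mask_iou`, `temporal_features`, `fit_linear_meta_regressor` and `predict_iou` are hypothetical, and a plain least-squares linear model stands in for whatever model the authors use.

```python
import numpy as np


def mask_iou(pred, gt):
    """Intersection over union of two boolean instance masks."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(pred, gt).sum() / union)


def temporal_features(history, n_frames=3):
    """Concatenate the metric vectors of the last n_frames of one track.

    history: non-empty list of 1-D metric vectors, oldest first
    (hypothetical metrics, e.g. instance size or mean score).
    Tracks shorter than n_frames are padded by repeating the oldest frame.
    """
    frames = [np.asarray(m, dtype=float) for m in history[-n_frames:]]
    while len(frames) < n_frames:
        frames.insert(0, frames[0])
    return np.concatenate(frames)


def fit_linear_meta_regressor(X, y):
    """Least-squares linear model with a bias term (an assumed stand-in)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return w


def predict_iou(w, x):
    """Estimated IoU for one temporal feature vector, clipped to [0, 1]."""
    x = np.asarray(x, dtype=float)
    return float(np.clip(np.append(x, 1.0) @ w, 0.0, 1.0))
```

In such a setup, the training targets would be computed with `mask_iou` on frames where ground truth is available, and the estimate returned by `predict_iou` could then be used in place of the network's score value. An instance whose estimated IoU is close to zero would be a candidate false positive.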
