Assessing the reliability of a machine learning model's predictions is an important quantity for deployment in safety-critical applications. Not only can it be used to detect novel scenes, either as out-of-distribution or anomalous samples, but it also helps to identify deficiencies in the training data distribution. Many promising research directions either build on traditional methods such as Gaussian processes or extend deep-learning-based approaches, for example by interpreting them from a Bayesian point of view. In this work we propose a novel approach for uncertainty estimation based on autoencoder models: the recursive application of a previously trained autoencoder can be interpreted as a dynamical system that stores training examples as attractors. While input images close to known samples converge to the same or a similar attractor, input samples containing unknown features are unstable and converge to different training samples, potentially removing or changing characteristic features along the way. Using dropout during both training and inference yields a family of similar dynamical systems, each robust on samples close to the training distribution but unstable on new features. Either the model reliably removes these features, or the resulting instability can be exploited to detect problematic input samples. We evaluate our approach on several dataset combinations as well as on an industrial application for occupant classification in the vehicle interior, for which we additionally release a new synthetic dataset.