We propose a method to estimate the uncertainty of the outcome of an image classifier on a given input datum. Deep neural networks commonly used for image classification are deterministic maps from an input image to an output class. As such, their outcome on a given datum involves no uncertainty, so we must specify what variability we refer to when defining, measuring, and interpreting "confidence." To this end, we introduce the Wellington Posterior: the distribution of outcomes that would have been obtained in response to data that could have been generated by the same scene that produced the given image. Since there are infinitely many scenes that could have generated the given image, computing the Wellington Posterior requires induction from scenes other than the one portrayed. We explore alternative methods based on data augmentation, ensembling, and model linearization; further alternatives include generative adversarial networks, conditional prior networks, and supervised single-view reconstruction. We test these alternatives against the empirical posterior obtained by inferring the class of temporally adjacent frames in a video. These developments are only a small step towards assessing the reliability of deep network classifiers in a manner compatible with safety-critical applications.
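To make the first of these alternatives concrete, here is a minimal sketch of estimating the Wellington Posterior by data augmentation: classify many perturbed versions of the input, treating the perturbations as stand-ins for other images the same scene could have produced, and read off the empirical distribution of outcomes. The classifier interface (a `model` returning class logits), the specific augmentation pipeline, and the sample count are illustrative assumptions, not the paper's exact protocol.

```python
import torch
import torchvision.transforms as T

def wellington_posterior_augmentation(model, image, num_classes, n_samples=64):
    """Approximate the Wellington Posterior of a deterministic classifier
    on one image by classifying augmented copies of it.

    Assumes `image` is a float tensor of shape (C, H, W) and `model`
    maps a batch of images to class logits (illustrative interface).
    """
    # Hypothetical proxy for "other data the same scene could have
    # generated": mild geometric and photometric perturbations.
    augment = T.Compose([
        T.RandomResizedCrop(224, scale=(0.8, 1.0)),
        T.ColorJitter(brightness=0.2, contrast=0.2),
        T.RandomHorizontalFlip(),
    ])

    model.eval()
    counts = torch.zeros(num_classes)
    with torch.no_grad():
        for _ in range(n_samples):
            x = augment(image).unsqueeze(0)       # one hypothetical view of the scene
            pred = model(x).argmax(dim=1).item()  # hard outcome of the deterministic map
            counts[pred] += 1

    return counts / n_samples  # empirical distribution over class outcomes
```

Under these assumptions, the returned histogram plays the role of the Wellington Posterior and can be compared against the empirical posterior obtained by classifying temporally adjacent frames of a video of the same scene.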