Deep neural networks can be roughly divided into deterministic neural networks and stochastic neural networks. The former is usually trained to achieve a mapping from the input space to the output space via maximum likelihood estimation of the weights, which leads to deterministic predictions at test time. In this way, a specific set of weights is estimated, while any uncertainty in the weight space is ignored. The latter introduces randomness into the framework, either by assuming a prior distribution over the model parameters (i.e., Bayesian neural networks) or by including latent variables (i.e., generative models) to explore the contribution of latent variables to the model predictions, leading to stochastic predictions at test time. Different from the former, which achieves point estimation, the latter aims to estimate the predictive distribution, making it possible to estimate uncertainty, which represents the model's ignorance about its own predictions. We claim that conventional deterministic neural networks for dense prediction are prone to overfitting, leading to over-confident predictions, which is undesirable for decision making. In this paper, we investigate stochastic neural networks and uncertainty estimation techniques to achieve both accurate deterministic predictions and reliable uncertainty estimation. Specifically, we work on two types of uncertainty estimation solutions, namely ensemble-based methods and generative-model-based methods, and explain their pros and cons when they are used in fully-, semi-, and weakly-supervised frameworks. Due to the close connection between uncertainty estimation and model calibration, we also introduce how uncertainty estimation can be used for deep model calibration to achieve well-calibrated models, namely dense model calibration. Code and data are available at https://github.com/JingZhang617/UncertaintyEstimation.
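To make the contrast between deterministic point prediction and stochastic prediction concrete, the following is a minimal sketch of ensemble-based uncertainty estimation for a dense prediction task. The toy TinySegNet architecture, ensemble size, and tensor shapes are illustrative assumptions and not the models used in the paper; the point is only that averaging the members gives a deterministic-style prediction, while their disagreement (variance) and the predictive entropy serve as per-pixel uncertainty maps.

```python
# Sketch of ensemble-based uncertainty for dense prediction (illustrative only).
import torch
import torch.nn as nn

class TinySegNet(nn.Module):
    """Toy fully convolutional network producing a per-pixel foreground logit."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, x):
        return self.body(x)  # [B, 1, H, W] logits

@torch.no_grad()
def ensemble_predict(models, image):
    """Mean prediction, member disagreement, and predictive entropy per pixel."""
    probs = torch.stack([torch.sigmoid(m(image)) for m in models], dim=0)  # [M, B, 1, H, W]
    mean = probs.mean(dim=0)   # deterministic-style point prediction
    var = probs.var(dim=0)     # disagreement among members: epistemic uncertainty proxy
    # Binary predictive entropy of the mean prediction (total uncertainty).
    entropy = -(mean * (mean + 1e-8).log() + (1 - mean) * (1 - mean + 1e-8).log())
    return mean, var, entropy

if __name__ == "__main__":
    # In practice each member is trained independently (different seeds/initialisations);
    # here randomly initialised networks simply stand in for trained members.
    members = [TinySegNet() for _ in range(5)]
    image = torch.randn(1, 3, 64, 64)
    mean, var, entropy = ensemble_predict(members, image)
    print(mean.shape, var.shape, entropy.shape)  # each: torch.Size([1, 1, 64, 64])
```

The same interface applies to generative-model-based solutions by replacing the ensemble members with repeated stochastic forward passes (e.g., sampling the latent variable), and the resulting uncertainty maps are what the calibration part of the paper relates to over-confident predictions.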