Although neural networks are powerful function approximators, the underlying modelling assumptions ultimately define the likelihood and thus the hypothesis class they parameterize. In classification, these assumptions are minimal, as the commonly employed softmax can represent any categorical distribution. In regression, however, restrictive assumptions on the type of continuous distribution being modelled are typically placed, such as the dominant choice of training via mean-squared error with its underlying Gaussianity assumption. Recent modelling advances make it possible to be agnostic to the type of continuous distribution being modelled, granting regression the flexibility of classification models. While past studies stress the benefit of such flexible regression models in terms of performance, here we study the effect of the model choice on uncertainty estimation. We highlight that under model misspecification, aleatoric uncertainty is not properly captured, and that a Bayesian treatment of a misspecified model leads to unreliable epistemic uncertainty estimates. Overall, our study provides an overview of how modelling choices in regression may influence uncertainty estimation and thus any downstream decision-making process.
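The link between mean-squared error and the Gaussianity assumption mentioned above can be made concrete: minimizing MSE is equivalent to maximizing a Gaussian likelihood with fixed variance. A minimal sketch (illustrative values, not from the paper) showing that the per-sample Gaussian negative log-likelihood with unit variance equals half the squared error plus a constant:

```python
import math

def gaussian_nll(y, mu, sigma=1.0):
    # Negative log-likelihood of observation y under N(mu, sigma^2).
    return 0.5 * math.log(2 * math.pi * sigma**2) + (y - mu) ** 2 / (2 * sigma**2)

# Toy targets and predictions (hypothetical values for illustration).
ys = [0.3, -1.2, 2.5]
preds = [0.1, -1.0, 2.0]

nll = sum(gaussian_nll(y, m) for y, m in zip(ys, preds)) / len(ys)
mse = sum((y - m) ** 2 for y, m in zip(ys, preds)) / len(ys)

# With sigma fixed at 1, mean NLL = 0.5 * MSE + 0.5 * log(2*pi),
# so the MSE minimizer and the Gaussian maximum-likelihood fit coincide.
const = 0.5 * math.log(2 * math.pi)
assert abs(nll - (0.5 * mse + const)) < 1e-12
```

If the true noise distribution is not Gaussian (e.g. skewed or heavy-tailed), this fixed-form likelihood is misspecified, which is the setting whose uncertainty consequences the abstract describes.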