The linearised Laplace method for estimating model uncertainty has received renewed attention in the Bayesian deep learning community. The method provides reliable error bars and admits a closed-form expression for the model evidence, allowing for scalable selection of model hyperparameters. In this work, we examine the assumptions behind this method, particularly in conjunction with model selection. We show that these assumptions interact poorly with some now-standard tools of deep learning (stochastic approximation methods and normalisation layers) and make recommendations for how to better adapt this classic method to the modern setting. We provide theoretical support for our recommendations and validate them empirically on MLPs, classic CNNs, residual networks with and without normalisation layers, generative autoencoders and transformers.
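To make the method concrete, the following is a minimal numpy sketch of linearised Laplace uncertainty estimation on a toy regression model. All names (the one-hidden-unit model `f`, the precisions `prior_prec` and `noise_prec`, the MAP weights `w_star`) are illustrative assumptions, not the paper's experimental setup: the network is linearised in its weights around a MAP estimate, the generalised Gauss-Newton (GGN) matrix approximates the Hessian, and the resulting Gaussian posterior gives closed-form predictive error bars.

```python
import numpy as np

def f(x, w):
    # Toy one-hidden-unit network: f(x; w) = w2 * tanh(w1 * x).
    return w[1] * np.tanh(w[0] * x)

def jacobian(x, w, eps=1e-6):
    # Finite-difference Jacobian of f w.r.t. the weights at input x.
    J = np.zeros_like(w)
    for i in range(len(w)):
        wp, wm = w.copy(), w.copy()
        wp[i] += eps
        wm[i] -= eps
        J[i] = (f(x, wp) - f(x, wm)) / (2 * eps)
    return J

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=20)      # training inputs
w_star = np.array([1.3, 0.7])        # assumed MAP estimate (found elsewhere)
noise_prec = 25.0                    # beta: observation-noise precision
prior_prec = 1.0                     # alpha: isotropic prior precision

# GGN approximation to the Hessian at w_star; for a regression likelihood
# this is beta * sum_n J_n J_n^T, exact for the linearised model.
H = sum(noise_prec * np.outer(jacobian(x, w_star), jacobian(x, w_star))
        for x in X)

# Laplace posterior covariance over the weights.
Sigma = np.linalg.inv(H + prior_prec * np.eye(len(w_star)))

# Predictive variance of the linearised function at a test input:
# var[f(x)] = J(x)^T Sigma J(x)  (add 1/beta for observation noise).
x_test = 0.5
J_test = jacobian(x_test, w_star)
var = float(J_test @ Sigma @ J_test)
print(var)
```

The same Gaussian posterior also yields the closed-form model evidence the abstract refers to, which is what makes hyperparameters such as `prior_prec` selectable by gradient-based optimisation rather than cross-validation.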