Pre-trained deep neural networks can be adapted to perform uncertainty estimation by transforming them into Bayesian neural networks via methods such as the Laplace approximation (LA) or its linearized form (LLA), among others. To make these methods more tractable, the generalized Gauss-Newton (GGN) approximation is often used. However, due to their computational cost, both LA and LLA rely on further approximations, such as Kronecker-factored or diagonal approximate GGN matrices, which can affect the quality of the results. To address these issues, we propose a new method for scaling LLA using a variational sparse Gaussian process (GP) approximation based on the dual RKHS formulation of GPs. Our method retains the predictive mean of the original model while allowing for efficient stochastic optimization and scalability in both the number of parameters and the size of the training dataset. Moreover, its training cost is independent of the number of training points, an improvement over previously existing methods. Our preliminary experiments indicate that it outperforms existing efficient variants of LLA, such as accelerated LLA (ELLA), which is based on the Nystr\"om approximation.
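For context, the display below sketches the standard LLA predictive distribution under the GGN approximation; it is added here only as a reminder of the setting and uses our own notation ($\hat{\theta}$, $J_{\hat{\theta}}$, $\Lambda_n$, $\lambda$), not that of the paper. It makes explicit why the linearized predictive mean coincides with the output of the original network, and why computing the predictive covariance exactly is costly in the number of parameters and training points.
% Sketch of the standard LLA predictive under the GGN approximation (notation ours, for illustration only).
% \hat{\theta}: MAP estimate of the network parameters;
% J_{\hat{\theta}}(x) = \partial f_{\theta}(x) / \partial \theta \big|_{\theta = \hat{\theta}}: Jacobian of the network output;
% \Lambda_n: per-example Hessian of the negative log-likelihood w.r.t. the network output;
% \lambda: prior precision.
\begin{align}
  \Sigma^{-1} &= \sum_{n=1}^{N} J_{\hat{\theta}}(x_n)^\top \Lambda_n \, J_{\hat{\theta}}(x_n) + \lambda I \,, \\
  p\big(f(x_\star) \mid \mathcal{D}\big) &\approx
    \mathcal{N}\Big( f_{\hat{\theta}}(x_\star),\;
      J_{\hat{\theta}}(x_\star)\, \Sigma \, J_{\hat{\theta}}(x_\star)^\top \Big) \,.
\end{align}
The mean term $f_{\hat{\theta}}(x_\star)$ is simply the pre-trained network's prediction, which is the property the proposed method preserves, while the covariance involves the full GGN matrix, motivating sparse approximations.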