We propose a novel approach to approximate Bayesian inference in complex models such as Bayesian neural networks. The approach scales to large datasets better than Markov chain Monte Carlo, admits more expressive models than variational inference, and does not rely on adversarial training (or density-ratio estimation). We adopt the recent approach of constructing two models: (1) a primary model, tasked with performing regression or classification; and (2) a secondary, expressive (e.g. implicit) model that defines an approximate posterior distribution over the parameters of the primary model. We optimise the parameters of the posterior model via gradient descent according to a Monte Carlo estimate of the posterior predictive distribution -- which is our only approximation (other than the posterior model itself). Only a likelihood needs to be specified, and it can take various forms, such as loss functions and synthetic likelihoods, yielding a form of likelihood-free approach. Furthermore, we formulate the approach so that the posterior samples can be either independent of, or conditionally dependent upon, the inputs to the primary model. The latter is shown to be capable of increasing the apparent complexity of the primary model, which we expect to be useful in applications such as surrogate and physics-based models. To show that the Bayesian paradigm offers more than just uncertainty quantification, we demonstrate uncertainty quantification, multi-modality, and an application with a recent deep forecasting neural network architecture.
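The core training loop described above -- a posterior model whose parameters are optimised by gradient descent on a Monte Carlo estimate of the posterior predictive -- can be sketched as follows. Everything here is an illustrative assumption, not the paper's implementation: the primary model is a toy linear regression, the posterior model is a simple reparameterised Gaussian sampler (rather than the expressive implicit model the abstract envisions), the likelihood is Gaussian, and central finite differences stand in for automatic differentiation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset for the primary model (hypothetical: linear regression y = w*x + b).
x = np.linspace(-1.0, 1.0, 20)
y = 2.0 * x + 1.0 + 0.25 * rng.normal(size=x.shape)

OBS_SIGMA = 0.25  # assumed scale of the Gaussian likelihood


def neg_log_predictive(phi, eps):
    """Negative Monte Carlo estimate of the log posterior predictive,
    -log (1/S) * sum_s p(y | x, theta_s), averaged over the data.
    phi = (mu_w, mu_b, log_sig_w, log_sig_b) parameterises the posterior
    model; eps holds the base-noise draws (common random numbers, so the
    objective is deterministic in phi for a fixed eps)."""
    thetas = phi[:2] + np.exp(phi[2:]) * eps        # (S, 2) samples of (w, b)
    preds = thetas[:, :1] * x + thetas[:, 1:2]      # (S, N) primary-model outputs
    log_lik = (-0.5 * ((y - preds) / OBS_SIGMA) ** 2
               - np.log(OBS_SIGMA * np.sqrt(2.0 * np.pi)))
    m = log_lik.max(axis=0)                          # stable log-mean-exp over samples
    log_pred = m + np.log(np.exp(log_lik - m).mean(axis=0))
    return -log_pred.mean()


def num_grad(f, phi, h=1e-5):
    """Central finite differences stand in for autodiff in this sketch."""
    g = np.zeros_like(phi)
    for i in range(phi.size):
        e = np.zeros_like(phi)
        e[i] = h
        g[i] = (f(phi + e) - f(phi - e)) / (2.0 * h)
    return g


phi = np.array([0.0, 0.0, -1.0, -1.0])               # initial posterior-model parameters
eps_eval = rng.normal(size=(256, 2))                 # held-out noise for evaluation
loss_before = neg_log_predictive(phi, eps_eval)

for _ in range(200):
    eps = rng.normal(size=(64, 2))                   # fresh Monte Carlo noise per step,
    phi -= 0.02 * num_grad(lambda p: neg_log_predictive(p, eps), phi)  # shared by +/-h evals

loss_after = neg_log_predictive(phi, eps_eval)
```

After training, the posterior model's mean drifts toward the data-generating parameters (w, b) = (2, 1) and the predictive loss drops, illustrating that the posterior model alone carries the approximation: the objective itself is just a Monte Carlo estimate of the predictive density.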