We formulate natural gradient variational inference (VI), expectation propagation (EP), and posterior linearisation (PL) as extensions of Newton's method for optimising the parameters of a Bayesian posterior distribution. This viewpoint explicitly casts inference algorithms within the framework of numerical optimisation. We show that common approximations to Newton's method from the optimisation literature, namely Gauss-Newton and quasi-Newton methods (e.g., the BFGS algorithm), remain valid under this `Bayes-Newton' framework. This leads to a suite of novel algorithms that are guaranteed to produce positive semi-definite covariance matrices, unlike standard VI and EP. Our unifying viewpoint provides new insights into the connections between various inference schemes. All the presented methods apply to any model with a Gaussian prior and non-conjugate likelihood, which we demonstrate with (sparse) Gaussian processes and state space models.
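To make the central idea concrete, the following is a minimal sketch (not the authors' implementation) of one Newton-style update of a Gaussian posterior q(f) = N(m, S) under a Gaussian prior N(0, K) and a non-conjugate likelihood. The names `bayes_newton_step`, `log_lik`, `K`, and `y` are illustrative assumptions, and the PSD safeguard shown here (eigenvalue clipping of the likelihood Hessian) is a simple stand-in for the Gauss-Newton and quasi-Newton constructions discussed in the paper.

```python
import jax
import jax.numpy as jnp

def bayes_newton_step(m, K, y, log_lik, lr=1.0):
    """One Newton-style update of a Gaussian posterior q(f) = N(m, S).

    Illustrative sketch: Gaussian prior N(0, K), pointwise log-likelihood
    log_lik(y, f); all names here are assumptions, not the paper's API.
    """
    obj = lambda f: jnp.sum(log_lik(y, f))
    g = jax.grad(obj)(m)       # gradient of the log-likelihood at the mean
    H = jax.hessian(obj)(m)    # Hessian of the log-likelihood at the mean
    # PSD safeguard (eigenvalue clipping), standing in for the paper's
    # Gauss-Newton / quasi-Newton approximations: force -H_psd to be
    # positive semi-definite so that the resulting covariance is valid,
    # which a plain Newton step does not guarantee.
    w, V = jnp.linalg.eigh(H)
    H_psd = (V * jnp.minimum(w, 0.0)) @ V.T
    K_inv = jnp.linalg.inv(K)
    S = jnp.linalg.inv(K_inv - H_psd)     # covariance, PSD by construction
    m_new = m + lr * S @ (g - K_inv @ m)  # (damped) Newton step on the mean
    return m_new, S

# Example non-conjugate likelihood: Bernoulli with logit f, y in {0, 1},
# as in Gaussian process classification.
def bernoulli_log_lik(y, f):
    return y * f - jnp.logaddexp(0.0, f)  # log p(y | f)
```

With no clipping this reduces to the standard Newton (Laplace-style) iteration; the VI, EP, and PL variants unified in the paper differ in how the likelihood derivatives are averaged or linearised (e.g., taking expectations under q rather than evaluating at the mean), while the PSD-safe approximations slot in at the same point in the update.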