Gaussian processes (GPs) provide a framework for Bayesian inference that offers principled uncertainty estimates for a wide range of problems. For example, in regression problems with Gaussian likelihoods, a GP model has a closed-form posterior. However, computing the posterior GP scales cubically with the number of training examples and requires storing all examples in memory. To overcome these obstacles, sparse GPs have been proposed that approximate the true posterior GP with pseudo-training examples. Importantly, the number of pseudo-training examples is user-defined, which gives control over computational and memory complexity. In the general case, sparse GPs do not admit closed-form solutions, and one has to resort to approximate inference. In this context, a convenient choice is variational inference (VI), where the problem of Bayesian inference is cast as an optimization problem: maximizing a lower bound on the log marginal likelihood. This paves the way for a powerful and versatile framework, in which pseudo-training examples are treated as optimization arguments of the approximate posterior and are identified jointly with the hyperparameters of the generative model (i.e. prior and likelihood). The framework can naturally handle a wide scope of supervised learning problems, ranging from regression with heteroscedastic and non-Gaussian likelihoods to classification problems with discrete labels, as well as multilabel problems. The purpose of this tutorial is to make the basic material accessible to readers without prior knowledge of either GPs or VI. A proper exposition of the subject also gives access to more recent advances (such as importance-weighted VI and interdomain, multioutput and deep GPs) that can serve as inspiration for new research ideas.
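As a rough illustration of the closed-form case mentioned above, the following sketch computes the exact GP posterior for regression with a Gaussian likelihood. The squared-exponential kernel, the fixed hyperparameter values, the function names and the toy data are assumptions chosen for illustration and are not taken from the tutorial. The Cholesky factorisation of the N x N kernel matrix is the step that scales cubically with the number of training examples, and the full training set must be kept in memory to form that matrix.

```python
import numpy as np


def rbf_kernel(xa, xb, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between two sets of 1-D inputs (assumed kernel)."""
    sq_dist = (xa[:, None] - xb[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)


def gp_posterior(x_train, y_train, x_test, noise_var=0.1):
    """Posterior mean and covariance of the latent function at x_test."""
    # N x N kernel matrix plus Gaussian observation noise on the diagonal.
    k_nn = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_sn = rbf_kernel(x_test, x_train)
    k_ss = rbf_kernel(x_test, x_test)

    # Cholesky factorisation: the O(N^3) step in the number of training examples.
    chol = np.linalg.cholesky(k_nn)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_sn.T)

    mean = k_sn @ alpha
    cov = k_ss - v.T @ v
    return mean, cov


# Toy data (hypothetical): noisy observations of a sine function.
rng = np.random.default_rng(0)
x_train = np.linspace(-3.0, 3.0, 20)
y_train = np.sin(x_train) + 0.1 * rng.standard_normal(20)
x_test = np.linspace(-3.0, 3.0, 5)

mean, cov = gp_posterior(x_train, y_train, x_test)
```

A sparse GP would replace the N training inputs in the expensive factorisation with a much smaller, user-chosen set of pseudo-training examples, which is what brings the cost down.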