Actually Sparse Variational Gaussian Processes

Gaussian processes (GPs) are typically criticised for their unfavourable scaling in both computational and memory requirements. For large datasets, sparse GPs reduce these demands by conditioning on a small set of inducing variables designed to summarise the data. In practice, however, for large datasets that require many inducing variables, such as spatial data with short lengthscales, even sparse GPs can become computationally expensive, limited by the number of inducing variables one can use. In this work, we propose a new class of inter-domain variational GPs, constructed by projecting a GP onto a set of compactly supported B-spline basis functions. The key benefit of our approach is that the compact support of the B-spline basis functions admits the use of sparse linear algebra to significantly speed up matrix operations and drastically reduce the memory footprint. This allows us to very efficiently model fast-varying spatial phenomena with tens of thousands of inducing variables, where previous approaches failed.
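To make the sparsity argument concrete, the following is a minimal sketch, not the authors' implementation. It assumes uniform integer knots, cubic B-splines, and a plain L2 inner product approximated by a midpoint rule; the inter-domain construction in the paper may use a different inner product. The names `bspline_basis` and `gram_matrix` are hypothetical helpers introduced only for this illustration. The point it shows is that two B-splines of degree p overlap only when their indices differ by at most p, so the matrix of pairwise inner products is banded.

```python
def bspline_basis(i, p, knots, x):
    """Cox-de Boor recursion: value at x of the i-th B-spline of degree p."""
    if p == 0:
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left_den = knots[i + p] - knots[i]
    right_den = knots[i + p + 1] - knots[i + 1]
    left = 0.0
    if left_den != 0:
        left = (x - knots[i]) / left_den * bspline_basis(i, p - 1, knots, x)
    right = 0.0
    if right_den != 0:
        right = (knots[i + p + 1] - x) / right_den * bspline_basis(i + 1, p - 1, knots, x)
    return left + right


def gram_matrix(num_basis, degree, grid_per_interval=20):
    """Approximate the L2 Gram matrix <phi_i, phi_j> with a midpoint rule.

    Assumption: uniform integer knots 0, 1, ..., num_basis + degree, so that
    basis function i is supported on [i, i + degree + 1).
    """
    knots = list(range(num_basis + degree + 1))
    h = 1.0 / grid_per_interval
    num_points = (num_basis + degree) * grid_per_interval
    gram = [[0.0] * num_basis for _ in range(num_basis)]
    for n in range(num_points):
        x = (n + 0.5) * h
        vals = [bspline_basis(i, degree, knots, x) for i in range(num_basis)]
        # Only the basis functions whose support contains x contribute.
        active = [i for i, v in enumerate(vals) if v != 0.0]
        for i in active:
            for j in active:
                gram[i][j] += vals[i] * vals[j] * h
    return gram


num_basis, degree = 12, 3
gram = gram_matrix(num_basis, degree)
nnz = sum(1 for row in gram for v in row if v != 0.0)
print(f"non-zero entries: {nnz} of {num_basis * num_basis}")
```

For M basis functions of degree p, the band holds M(2p + 1) - p(p + 1) non-zero entries, which grows linearly in M, whereas a dense matrix holds M² entries. This linear growth is what makes sparse or banded solvers attractive when M reaches the tens of thousands.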

Related content

Gaussian process

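A Gaussian process (GP) is a type of stochastic process in probability theory and mathematical statistics: a collection of random variables, indexed by an index set, in which every finite subset is jointly normally distributed. Equivalently, every finite linear combination of these random variables is normally distributed. Over a continuous index set, the law of the process is a Gaussian measure, so a GP can be viewed as an infinite-dimensional generalisation of the joint normal distribution. A Gaussian process is completely determined by its mean function and covariance function, and it inherits many properties of the normal distribution.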
