Gaussian processes (GPs) are typically criticised for their unfavourable scaling in both computational and memory requirements. For large datasets, sparse GPs reduce these demands by conditioning on a small set of inducing variables designed to summarise the data. In practice, however, for large datasets requiring many inducing variables, such as short-lengthscale spatial data, even sparse GPs can become computationally expensive and are limited by the number of inducing variables one can use. In this work, we propose a new class of inter-domain variational GP, constructed by projecting a GP onto a set of compactly supported B-spline basis functions. The key benefit of our approach is that the compact support of the B-spline basis functions admits the use of sparse linear algebra, which significantly speeds up matrix operations and drastically reduces the memory footprint. This allows us to efficiently model fast-varying spatial phenomena with tens of thousands of inducing variables, where previous approaches failed.
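To illustrate why compact support leads to sparse matrices, the following is a minimal one-dimensional sketch in plain Python. It is not the construction used in this work, whose details are not given here. It assumes cubic B-splines on uniform knots, and the names `bspline_basis`, `design_matrix` and `gram` are hypothetical helpers introduced only for this example. It shows that each input activates at most `degree + 1` basis functions, so the Gram matrix of basis evaluations is banded.

```python
def bspline_basis(i, k, t, x):
    """Cox-de Boor recursion: i-th B-spline of degree k on knots t, evaluated at x."""
    if k == 0:
        return 1.0 if t[i] <= x < t[i + 1] else 0.0
    left = 0.0
    if t[i + k] > t[i]:
        left = (x - t[i]) / (t[i + k] - t[i]) * bspline_basis(i, k - 1, t, x)
    right = 0.0
    if t[i + k + 1] > t[i + 1]:
        right = ((t[i + k + 1] - x) / (t[i + k + 1] - t[i + 1])
                 * bspline_basis(i + 1, k - 1, t, x))
    return left + right


def design_matrix(xs, t, k):
    """Rows are inputs, columns are basis functions (dense here, for clarity only)."""
    m = len(t) - k - 1  # number of basis functions defined by the knot vector
    return [[bspline_basis(i, k, t, x) for i in range(m)] for x in xs]


def gram(phi):
    """Phi^T Phi: entry (i, j) is nonzero only if basis supports i and j overlap."""
    m = len(phi[0])
    return [[sum(row[i] * row[j] for row in phi) for j in range(m)]
            for i in range(m)]


# Assumed toy setup: cubic splines, 20 basis functions, uniform integer knots.
degree = 3
num_basis = 20
knots = [float(j) for j in range(num_basis + degree + 1)]

# Inputs restricted to [knots[degree], knots[num_basis]), where the basis sums to one.
xs = [degree + (num_basis - degree) * j / 200 for j in range(200)]

phi = design_matrix(xs, knots, degree)
G = gram(phi)

# Largest number of basis functions active at any single input.
max_active = max(sum(1 for v in row if v != 0.0) for row in phi)

# Largest |i - j| for which the Gram matrix has a nonzero entry.
bandwidth = max(abs(i - j)
                for i in range(num_basis)
                for j in range(num_basis)
                if G[i][j] != 0.0)

print("max active basis functions per input:", max_active)
print("Gram matrix bandwidth:", bandwidth)
```

In this toy setting the number of nonzero entries in the Gram matrix grows linearly with the number of basis functions rather than quadratically, which is the property that sparse or banded solvers exploit. A practical implementation would store these matrices in a sparse format instead of nested lists.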