Among generalized additive models, additive Mat\'ern Gaussian Processes (GPs) are one of the most popular choices for scalable high-dimensional problems. Thanks to their additive structure and stochastic differential equation representation, back-fitting-based algorithms can reduce the time complexity of computing the posterior mean from $O(n^3)$ to $O(n\log n)$, where $n$ is the data size. However, generalizing these algorithms to efficiently compute the posterior variance and maximum log-likelihood remains an open problem. In this study, we demonstrate that for additive Mat\'ern GPs, not only the posterior mean but also the posterior variance, the log-likelihood, and the gradients of these three functions can be expressed by formulas involving only sparse matrices and sparse vectors. We show how to use these sparse formulas to generalize back-fitting-based algorithms so that the posterior mean, posterior variance, log-likelihood, and the gradients of these three functions can all be computed for additive GPs in $O(n \log n)$ time. We apply our algorithms to Bayesian optimization and propose efficient procedures for posterior updates, hyperparameter learning, and the computation of the acquisition function and its gradient. Given the posterior, our algorithms significantly reduce the time complexity of computing the acquisition function and its gradient from $O(n^2)$ to $O(\log n)$ for a general learning rate, and even to $O(1)$ for a small learning rate.
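For context, a standard form of the additive Mat\'ern kernel (stated here as background for illustration, with per-dimension hyperparameters $\sigma_d, \ell_d$ assumed rather than taken from the abstract) sums a one-dimensional Mat\'ern kernel over the input coordinates, e.g.\ with smoothness $\nu = 3/2$:
\[
k(x, x') \;=\; \sum_{d=1}^{D} k_d\bigl(|x_d - x'_d|\bigr),
\qquad
k_d(r) \;=\; \sigma_d^2 \left(1 + \frac{\sqrt{3}\, r}{\ell_d}\right) \exp\!\left(-\frac{\sqrt{3}\, r}{\ell_d}\right),
\]
where each one-dimensional component admits a state-space (stochastic differential equation) representation; this per-coordinate structure is what the sparse formulas and back-fitting-based algorithms exploit.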