Gaussian processes (GPs) are Bayesian non-parametric models used in a wide range of applications. Despite their popularity, the cost of GP prediction (quadratic storage and cubic computational complexity in the number of training points) remains a hurdle to applying GPs to large data sets. We present FastMuyGPs, a fast posterior mean prediction algorithm that addresses this shortcoming. FastMuyGPs is based on the MuyGPs hyperparameter estimation algorithm and combines leave-one-out cross-validation, batching, nearest-neighbor sparsification, and precomputation to provide fast, scalable GP prediction. We demonstrate several benchmarks in which FastMuyGPs attains superior accuracy and competitive or superior runtime relative to both deep neural networks and state-of-the-art scalable GP algorithms.
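To make the roles of nearest-neighbor sparsification and precomputation concrete, the following is a minimal sketch of the general idea, not the authors' implementation. It assumes a squared-exponential kernel with fixed hyperparameters and a brute-force neighbor search, and it omits the leave-one-out hyperparameter estimation and batching steps entirely. All function names (`rbf_kernel`, `nearest_indices`, `precompute_coefficients`, `fast_posterior_mean`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between rows of A and rows of B (an assumed kernel choice)."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / length_scale**2)


def nearest_indices(queries, points, k):
    """Brute-force indices of the k nearest rows of `points` for each row of `queries`."""
    sq_dists = (
        np.sum(queries**2, axis=1)[:, None]
        + np.sum(points**2, axis=1)[None, :]
        - 2.0 * queries @ points.T
    )
    return np.argsort(sq_dists, axis=1)[:, :k]


def precompute_coefficients(X, y, k, length_scale=1.0, noise=1e-6):
    """Offline step: solve one small k-by-k system per training point."""
    # Each neighborhood typically contains the training point itself (distance 0).
    neighborhoods = nearest_indices(X, X, k)
    coeffs = np.empty((X.shape[0], k))
    for i, nbrs in enumerate(neighborhoods):
        K_local = rbf_kernel(X[nbrs], X[nbrs], length_scale) + noise * np.eye(k)
        coeffs[i] = np.linalg.solve(K_local, y[nbrs])
    return neighborhoods, coeffs


def fast_posterior_mean(X_test, X, neighborhoods, coeffs, length_scale=1.0):
    """Online step: one nearest-neighbor lookup and one dot product per test point."""
    closest = nearest_indices(X_test, X, 1)[:, 0]
    preds = np.empty(X_test.shape[0])
    for j, i in enumerate(closest):
        # Reuse the neighborhood and the solved coefficients of the closest training point.
        k_cross = rbf_kernel(X_test[j : j + 1], X[neighborhoods[i]], length_scale)[0]
        preds[j] = k_cross @ coeffs[i]
    return preds
```

In this sketch the cubic-cost linear solves involve only k-by-k matrices and happen once, before any test point is seen. Prediction then requires no matrix solve at all, which is the kind of saving that precomputation is meant to deliver; the accuracy of the approximation depends on the neighborhood size k and on the kernel hyperparameters.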