Gaussian processes (GPs) are highly flexible, nonparametric statistical models that are commonly used to fit nonlinear relationships or to account for correlation between observations. However, the computational cost of fitting a Gaussian process is $\mathcal{O}(n^3)$ in the number of observations $n$, making GPs infeasible for large datasets. To address this, this research focuses on the use of minibatching to estimate GP parameters. Specifically, we outline both approximate and exact minibatch Markov chain Monte Carlo algorithms that substantially reduce the computational cost of fitting a GP by considering only small subsets of the data at a time. We demonstrate and compare these methods on a variety of simulated and real datasets.