With a principled representation of uncertainty and closed-form posterior updates, Gaussian processes (GPs) are a natural choice for online decision making. However, Gaussian processes typically require at least $\mathcal{O}(n^2)$ computations for $n$ training points, limiting their general applicability. Stochastic variational Gaussian processes (SVGPs) can provide scalable inference for a dataset of fixed size, but are difficult to efficiently condition on new data. We propose online variational conditioning (OVC), a procedure for efficiently conditioning SVGPs in an online setting that does not require re-training through the evidence lower bound (ELBO) when new data arrive. OVC enables the pairing of SVGPs with advanced look-ahead acquisition functions for black-box optimization, even with non-Gaussian likelihoods. We show that OVC provides compelling performance in a range of applications, including active learning of malaria incidence and reinforcement learning on MuJoCo simulated robotic control tasks.
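To make the scaling bottleneck concrete, the following is a minimal NumPy sketch of exact GP regression with its closed-form posterior, where conditioning on each new observation forces a fresh $\mathcal{O}(n^3)$ Cholesky solve. The kernel, data, and names (`rbf_kernel`, `gp_posterior`) are illustrative choices of ours, not the paper's OVC implementation, which instead updates an SVGP's variational distribution without this re-solve.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix between row-stacked inputs A and B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(X, y, X_star, noise=1e-2):
    """Closed-form exact GP posterior mean and variance at test inputs X_star.

    The Cholesky factorization below is O(n^3) and must be redone from
    scratch every time a training point is appended -- the online-cost
    bottleneck the abstract describes.
    """
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y
    K_s = rbf_kernel(X, X_star)
    v = np.linalg.solve(L, K_s)                          # L^{-1} K_s
    mean = K_s.T @ alpha
    var = np.diag(rbf_kernel(X_star, X_star)) - np.sum(v * v, axis=0)
    return mean, var

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(50, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(50)

# Naive online conditioning: append the new observation and re-solve
# the full system, so t rounds of conditioning cost O(sum of t^3).
x_new, y_new = np.array([[0.3]]), np.array([0.25])
X, y = np.vstack([X, x_new]), np.concatenate([y, y_new])
mean, var = gp_posterior(X, y, np.linspace(-2, 2, 5)[:, None])
```

OVC-style conditioning sidesteps this by reusing the SVGP's fixed-size variational summary of past data, so the per-update cost depends on the number of inducing points rather than on $n$.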