Gaussian processes (GPs) are an important tool in machine learning and statistics, with applications ranging from the social and natural sciences to engineering. They constitute a powerful kernelized non-parametric method with well-calibrated uncertainty estimates. However, off-the-shelf GP inference procedures are limited to datasets with several thousand data points because of their cubic computational complexity. For this reason, many sparse GP techniques have been developed in recent years. In this paper, we focus on GP regression tasks and propose a new approach based on aggregating predictions from several local and correlated experts. The degree of correlation between the experts can vary from independent to fully correlated. The individual predictions of the experts are aggregated taking their correlation into account, which results in consistent uncertainty estimates. Our method recovers the independent Product of Experts, sparse GP, and full GP in the limiting cases. The presented framework can deal with a general kernel function and multiple variables, and its time and space complexity is linear in the number of experts and data samples, which makes our approach highly scalable. We demonstrate superior performance, in a time vs. accuracy sense, of our proposed method against state-of-the-art GP approximation methods on synthetic as well as several real-world datasets with deterministic and stochastic optimization.
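To make the independent limiting case concrete, the following is a minimal sketch of the standard Product of Experts aggregation of Gaussian predictions. It is not the correlated aggregation proposed in the paper. The function name `poe_aggregate` and the array layout are assumptions made for illustration only.

```python
import numpy as np


def poe_aggregate(means, variances):
    """Combine independent Gaussian expert predictions by precision weighting.

    Hypothetical helper for illustration: `means` and `variances` have shape
    (n_experts, n_test_points) and hold each expert's predictive mean and
    variance. Returns the aggregated mean and variance per test point.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    precisions = 1.0 / variances
    # Under independence, the precisions of the experts add up.
    agg_var = 1.0 / precisions.sum(axis=0)
    # The aggregated mean is the precision-weighted average of the expert means.
    agg_mean = agg_var * (precisions * means).sum(axis=0)
    return agg_mean, agg_var


# Two experts, one test point, equal confidence.
mean, var = poe_aggregate([[0.0], [2.0]], [[1.0], [1.0]])
```

Because independence is assumed, the aggregated variance shrinks with every added expert, even when the experts share information. This is the kind of overconfidence that accounting for correlation between experts is meant to avoid.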