Kernel-based models such as kernel ridge regression and Gaussian processes are ubiquitous in machine learning applications for regression and optimization. It is well known that a major downside of kernel-based models is their high computational cost: given a dataset of $n$ samples, the cost grows as $\mathcal{O}(n^3)$. Existing sparse approximation methods can yield a significant reduction in this cost, effectively reducing it to as low as $\mathcal{O}(n)$ in certain cases. Despite this remarkable empirical success, significant gaps remain in the existing analytical results on confidence bounds for the approximation error. In this work, we provide novel confidence intervals for the Nystr\"om method and the sparse variational Gaussian process (SVGP) approximation method. Our confidence intervals lead to improved error bounds in both regression and optimization. We establish these confidence intervals using novel interpretations of the approximate (surrogate) posterior variance of the models.
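To make the cost reduction concrete, the following is a minimal sketch (not the paper's implementation) of Nystr\"om-approximated kernel ridge regression in NumPy. It assumes an RBF kernel and uniformly sampled landmark points; the function names and parameters (`nystrom_krr_predict`, `m`, `lam`) are illustrative. With $m \ll n$ landmarks, only $m \times m$ systems are solved, so the dominant cost drops from $\mathcal{O}(n^3)$ for exact kernel ridge regression to $\mathcal{O}(n m^2)$.

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale=1.0):
    # Squared-exponential (RBF) kernel matrix between rows of X and Y.
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-0.5 * sq / lengthscale**2)

def nystrom_krr_predict(X, y, X_test, m=50, lam=1e-2, seed=0):
    """Nystrom-approximated kernel ridge regression (subset-of-regressors form).

    Exact KRR inverts an n x n kernel matrix at O(n^3) cost; here, m << n
    landmark points are subsampled so that only an m x m system is solved,
    for O(n m^2) total cost.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    idx = rng.choice(n, size=m, replace=False)  # uniformly sampled landmarks
    Z = X[idx]
    K_nm = rbf_kernel(X, Z)                     # n x m cross-kernel
    K_mm = rbf_kernel(Z, Z)                     # m x m landmark kernel
    # Solve (K_nm^T K_nm + lam * K_mm) a = K_nm^T y -- an m x m system;
    # a small jitter keeps the solve numerically stable.
    A = K_nm.T @ K_nm + lam * K_mm + 1e-8 * np.eye(m)
    a = np.linalg.solve(A, K_nm.T @ y)
    return rbf_kernel(X_test, Z) @ a

if __name__ == "__main__":
    # Toy 1-D regression problem with n = 2000 samples and m = 100 landmarks.
    rng = np.random.default_rng(1)
    X = rng.uniform(-3, 3, size=(2000, 1))
    y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(2000)
    X_test = np.linspace(-3, 3, 5)[:, None]
    print(nystrom_krr_predict(X, y, X_test, m=100))
```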