Gaussian process regression is used throughout statistics and machine learning for prediction and uncertainty quantification. A Gaussian process is specified by its mean and covariance functions. Many covariance functions, including those of the Matérn family, have a smoothness parameter that is notoriously difficult to specify correctly or to estimate from the data. In practice, the smoothness parameter is often selected more or less arbitrarily. We introduce rate-unbiasedness, a relaxed notion of asymptotic optimality which requires that the expected ratio of the mean-square error presumed by a potentially misspecified model to the true, but unknown, mean-square error remain bounded away from zero and infinity as more data are obtained. A rate-unbiased model provides uncertainty quantification that is of the correct order of magnitude. We then prove that scale estimation suffices for rate-unbiasedness in a variety of common settings. As estimation of the scale of a Gaussian process is routine and requires no optimisation, rate-unbiasedness can be achieved in many applications.
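To illustrate why scale estimation requires no optimisation, the following is a minimal sketch and not the authors' implementation. It assumes a zero-mean Gaussian process with covariance sigma^2 K, for which the standard maximum likelihood estimate of the scale is y^T K^{-1} y / n. The Matérn kernel with smoothness fixed at 3/2, the lengthscale, the jitter, and all function names (`matern32`, `scale_mle`, `scaled_predictive_variance`) are illustrative assumptions.

```python
import numpy as np


def matern32(x1, x2, lengthscale=0.2):
    """Unit-scale Matern kernel with smoothness 3/2 on 1-D inputs (assumed choice)."""
    r = np.abs(x1[:, None] - x2[None, :]) / lengthscale
    a = np.sqrt(3.0) * r
    return (1.0 + a) * np.exp(-a)


def scale_mle(K, y):
    """Closed-form MLE of sigma^2 for y ~ N(0, sigma^2 K): y^T K^{-1} y / n."""
    L = np.linalg.cholesky(K)
    z = np.linalg.solve(L, y)  # z = L^{-1} y, so z @ z = y^T K^{-1} y
    return float(z @ z) / y.shape[0]


def scaled_predictive_variance(K, k_star, k_star_star, sigma2):
    """Mean-square error presumed by the model at test points, rescaled by sigma2.

    K: (n, n) unit-scale train covariance, k_star: (n, m) train/test covariance,
    k_star_star: (m,) unit-scale prior variances at the test points.
    """
    L = np.linalg.cholesky(K)
    v = np.linalg.solve(L, k_star)
    unit_var = k_star_star - np.sum(v * v, axis=0)
    return sigma2 * np.maximum(unit_var, 0.0)  # clip tiny negative round-off


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 200)
    K = matern32(x, x) + 1e-8 * np.eye(x.size)  # small jitter for stability
    # Synthetic data drawn with true scale sigma^2 = 4.
    y = 2.0 * np.linalg.cholesky(K) @ rng.standard_normal(x.size)
    sigma2_hat = scale_mle(K, y)
    x_new = np.array([0.123, 0.5678])
    var = scaled_predictive_variance(
        K, matern32(x, x_new), np.ones(x_new.size), sigma2_hat
    )
    print("estimated scale:", sigma2_hat)
    print("presumed mean-square errors:", var)
```

The estimate costs one Cholesky factorisation and one triangular solve. The smoothness parameter is never tuned: it stays at whatever value the kernel was given, and only the scale multiplying the predictive variance is fitted to the data.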