Gaussian process regression has proven very powerful in statistics, machine learning and inverse problems. A crucial aspect of the success of this methodology, in a wide range of applications to complex and real-world problems, is hierarchical modeling and the learning of hyperparameters. The purpose of this paper is to study two paradigms for learning hierarchical parameters: one takes the probabilistic Bayesian perspective, in particular the empirical Bayes approach that has been widely used in Bayesian statistics; the other takes the deterministic, approximation-theoretic view, in particular the kernel flow algorithm that was proposed recently in the machine learning literature. For a Mat\'ern-like model on the torus, we establish the consistency of both approaches in the large data limit and explicitly identify their implicit bias in parameter learning. A particular technical challenge we overcome is the learning of the regularity parameter in the Mat\'ern-like field, for which consistency results have been very scarce in the spatial statistics literature. Moreover, we conduct extensive numerical experiments beyond the Mat\'ern-like model, comparing the two algorithms further. These experiments demonstrate learning of other hierarchical parameters, such as amplitude and lengthscale; they also illustrate the setting of model misspecification, in which the kernel flow approach could show superior performance to the more traditional empirical Bayes approach.