We present a computationally efficient strategy for finding the hyperparameters of a Gaussian process (GP) that avoids computing the likelihood function. The resulting hyperparameters can be used directly for regression or passed as initial conditions to maximum-likelihood (ML) training. Motivated by the fact that training a GP via ML is equivalent (on average) to minimising the KL-divergence between the true and learnt models, we set out to explore different metrics and divergences between GPs that are computationally inexpensive and provide estimates close to those of ML. In particular, we identify the GP hyperparameters by projecting the empirical covariance or (Fourier) power spectrum onto a parametric family, and we propose and study various measures of discrepancy operating in the temporal or frequency domain. Our contribution extends the Variogram method developed in the geostatistics literature and is accordingly referred to as the Generalised Variogram method (GVM). In addition to the theoretical presentation of GVM, we provide experimental validation in terms of accuracy, consistency with ML, and computational complexity for different kernels, using synthetic and real-world data.
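To make the idea of projecting an empirical covariance onto a parametric family concrete, the following is a minimal sketch rather than the method as implemented in the paper. It assumes an evenly sampled series, a squared-exponential kernel as the parametric family, a squared Euclidean discrepancy in the temporal domain, and a plain grid search over the lengthscale. The function names (`empirical_autocovariance`, `se_kernel`, `project_onto_se`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def empirical_autocovariance(y, max_lag):
    """Biased empirical autocovariance of an evenly sampled series, lags 0..max_lag."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    n = y.size
    return np.array([np.dot(y[: n - k], y[k:]) / n for k in range(max_lag + 1)])


def se_kernel(tau, variance, lengthscale):
    """Squared-exponential kernel (an assumed parametric family)."""
    tau = np.asarray(tau, dtype=float)
    return variance * np.exp(-0.5 * (tau / lengthscale) ** 2)


def project_onto_se(lags, k_hat, lengthscales):
    """Fit (variance, lengthscale) by minimising the squared distance to k_hat.

    For each candidate lengthscale the optimal variance has a closed form
    (one-dimensional least squares), so only the lengthscale is searched.
    No likelihood is evaluated at any point.
    """
    k_hat = np.asarray(k_hat, dtype=float)
    best = None
    for ell in lengthscales:
        shape = se_kernel(lags, 1.0, ell)
        # Least-squares variance, clipped at zero to stay a valid kernel.
        variance = max(np.dot(k_hat, shape) / np.dot(shape, shape), 0.0)
        loss = float(np.sum((k_hat - variance * shape) ** 2))
        if best is None or loss < best[2]:
            best = (float(variance), float(ell), loss)
    return best  # (variance, lengthscale, loss)
```

In use, one would compute `k_hat = empirical_autocovariance(y, max_lag)` from the observations, build `lags = dt * np.arange(max_lag + 1)` for sampling interval `dt`, and pass both to `project_onto_se` together with a grid of candidate lengthscales. The returned pair could then serve as a starting point for ML training, in the spirit described above.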