In the Big Data era, and with the ubiquity of geolocation sensors in particular, massive datasets exhibiting a possibly complex spatial dependence structure are becoming increasingly available. In this context, the standard probabilistic theory of statistical learning does not apply directly, and guarantees on the generalization capacity of predictive rules learned from such data remain to be established. We analyze here the simple Kriging task from a statistical learning perspective, i.e. by carrying out a nonparametric finite-sample predictive analysis. Given $d\geq 1$ values taken by a realization of a square integrable random field $X=\{X_s\}_{s\in S}$, $S\subset \mathbb{R}^2$, with unknown covariance structure, at sites $s_1,\; \ldots,\; s_d$ in $S$, the goal is to predict the unknown values it takes at any other location $s\in S$ with minimum quadratic risk. The prediction rule is derived from a training spatial dataset: a single realization $X'$ of $X$, independent of those to be predicted, observed at $n\geq 1$ locations $\sigma_1,\; \ldots,\; \sigma_n$ in $S$. Despite the connection of this minimization problem with kernel ridge regression, establishing the generalization capacity of empirical risk minimizers is far from straightforward, because the training data $X'_{\sigma_1},\; \ldots,\; X'_{\sigma_n}$ involved in the learning procedure are not independent and identically distributed. In this article, non-asymptotic bounds of order $O_{\mathbb{P}}(1/\sqrt{n})$ are proved for the excess risk of a plug-in predictive rule mimicking the true minimizer, in the case of isotropic stationary Gaussian processes observed at locations forming a regular grid in the learning stage. These theoretical results are illustrated by various numerical experiments on simulated data and on real-world datasets.
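As a rough illustration of the prediction task described above, the following sketch computes a simple Kriging prediction at a new location $s$ from the $d$ observed values, using the classical linear predictor $c_d(s)^\top \Sigma_d^{-1} (X_{s_1}, \ldots, X_{s_d})^\top$. It assumes a zero-mean field and a known isotropic covariance function. The exponential covariance `exp_cov`, its parameters, and the function name `simple_kriging_predict` are hypothetical choices made for illustration and are not taken from the article. In a plug-in rule, the covariance function would instead be replaced by an estimate built from the training realization $X'$.

```python
import numpy as np


def exp_cov(h, sill=1.0, scale=1.0):
    """Hypothetical isotropic covariance c(h) = sill * exp(-h / scale)."""
    return sill * np.exp(-np.asarray(h, dtype=float) / scale)


def simple_kriging_predict(sites, values, target, cov=exp_cov):
    """Zero-mean simple Kriging prediction of the field at `target`.

    sites  : (d, 2) array of observation locations s_1, ..., s_d
    values : (d,) array of observed values X_{s_1}, ..., X_{s_d}
    target : (2,) array, the location s where a prediction is wanted
    cov    : isotropic covariance function of the distance (assumed known here)
    """
    sites = np.asarray(sites, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)

    # Covariance matrix Sigma_d between the observed sites.
    pairwise = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=-1)
    sigma = cov(pairwise)

    # Covariance vector c_d(s) between the target and the observed sites.
    c = cov(np.linalg.norm(sites - target, axis=-1))

    # Kriging weights solve Sigma_d w = c_d(s); the prediction is w^T values.
    weights = np.linalg.solve(sigma, c)
    return float(weights @ values)


if __name__ == "__main__":
    sites = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    values = [1.0, 2.0, -0.5]
    print(simple_kriging_predict(sites, values, [0.5, 0.5]))
```

With a covariance matrix that is invertible, this predictor interpolates the data: evaluating it at one of the observed sites returns the observed value there, and far from all sites it shrinks toward the assumed mean of zero.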