In spatial statistics, a common objective is to predict the values of a spatial process at unobserved locations by exploiting spatial dependence. Kriging provides the best linear unbiased predictor using covariance functions and is often associated with Gaussian processes. However, for non-linear prediction with non-Gaussian or categorical data, the Kriging prediction is no longer optimal, and the associated variance is often overly optimistic. Although deep neural networks (DNNs) are widely used for general classification and prediction, they have not been studied thoroughly for data with spatial dependence. In this work, we propose a novel DNN structure for spatial prediction, in which spatial dependence is captured by adding an embedding layer of spatial coordinates built from basis functions. We show, in theory and in simulation studies, that the proposed DeepKriging method has a direct link to Kriging in the Gaussian case and has multiple advantages over Kriging for non-Gaussian and non-stationary data: (i) it provides non-linear predictions and thus has smaller approximation errors; (ii) it does not require operations on covariance matrices and is thus scalable to large datasets; and (iii) with sufficiently many hidden neurons, it provides the optimal prediction in terms of model capacity. We further explore the possibility of quantifying prediction uncertainties based on density prediction, without assuming any data distribution. Finally, we apply the method to predicting PM2.5 concentrations across the continental United States.
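As a rough illustration of the embedding idea described above, the following minimal sketch maps a coordinate to a vector of basis-function values and passes that vector through a small feed-forward network. It is not the authors' implementation. The abstract only says "basis functions", so the Gaussian radial basis, the regular knot grid on the unit square, the bandwidth, the layer sizes, and all function names (`make_knots`, `rbf_embedding`, `dense`, `predict`) are assumptions made for illustration. The weights are random and untrained, so the output has no predictive meaning.

```python
import math
import random


def make_knots(per_side):
    """Regular per_side x per_side grid of knots on the unit square (assumed layout)."""
    step = 1.0 / (per_side - 1)
    return [(i * step, j * step) for i in range(per_side) for j in range(per_side)]


def rbf_embedding(coord, knots, bandwidth):
    """Embed one (x, y) location as a vector of Gaussian radial basis values.

    The Gaussian kernel is an illustrative choice, not taken from the abstract.
    """
    return [math.exp(-0.5 * (math.dist(coord, k) / bandwidth) ** 2) for k in knots]


def dense(x, weights, bias, relu):
    """One fully connected layer; weights[j] holds the input weights of unit j."""
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]
    return [max(v, 0.0) for v in out] if relu else out


def predict(coord, knots, bandwidth, layers):
    """Embed the coordinate, then apply the layers (ReLU on all but the last)."""
    h = rbf_embedding(coord, knots, bandwidth)
    for i, (weights, bias) in enumerate(layers):
        h = dense(h, weights, bias, relu=(i < len(layers) - 1))
    return h[0]


rng = random.Random(0)
knots = make_knots(4)          # 16 knots, so a 16-dimensional embedding
sizes = [len(knots), 8, 1]     # embedding -> 8 hidden units -> 1 output
layers = [
    ([[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)], [0.0] * n_out)
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]
y_hat = predict((0.3, 0.7), knots, 0.25, layers)  # untrained, illustrative only
```

In this sketch the network never sees a covariance matrix: spatial structure enters only through the basis-function features, which is the property the abstract credits for scalability. In practice the layers would be trained with a standard deep learning library, and the choice of basis family and knot resolution would need to follow the paper itself.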