Nonstationary and non-Gaussian spatial data are prevalent across many fields (e.g., counts of animal species, disease incidences in susceptible regions, and remotely sensed satellite imagery). Due to modern data collection methods, these datasets have grown considerably in size. Spatial generalized linear mixed models (SGLMMs) are a flexible class of models used to model nonstationary and non-Gaussian datasets. Despite their utility, SGLMMs can be computationally prohibitive for even moderately large datasets. To circumvent this issue, past studies have embedded nested radial basis functions into the SGLMM. However, two crucial specifications (knot locations and bandwidths), which directly affect model performance, are generally fixed prior to model fitting. We propose a novel algorithm to model large nonstationary and non-Gaussian spatial datasets using adaptive radial basis functions. Our approach: (1) partitions the spatial domain into subregions; (2) selects a carefully curated set of basis knot locations within each partition; and (3) models the latent spatial surface using partition-varying and data-driven (adaptive) basis functions. Through an extensive simulation study, we show that our approach provides more accurate predictions than a competing method while preserving computational efficiency. We also demonstrate our approach on two environmental datasets that feature incidences of a parasitic plant species and counts of bird species in the United States. Our method generalizes to other hierarchical spatial models, and we provide ready-to-use code written in nimble.
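
To make the three steps concrete, the following is a minimal sketch in plain Python, not the authors' nimble implementation. Everything in it is an illustrative assumption: the domain is the unit square split into a regular grid, knots are drawn at random from the observed locations in each cell (the abstract does not specify the selection rule), the basis functions are Gaussian, and each cell's bandwidth is a fixed placeholder rather than a quantity learned from the data. All function names are hypothetical.

```python
import math
import random


def partition_index(point, n_side):
    """Map a point in the unit square to one of n_side * n_side grid cells."""
    x, y = point
    col = min(int(x * n_side), n_side - 1)  # clamp x == 1.0 into the last column
    row = min(int(y * n_side), n_side - 1)  # clamp y == 1.0 into the last row
    return row * n_side + col


def select_knots(points, n_side, knots_per_cell, rng):
    """Pick up to knots_per_cell observed locations in each cell as knots.

    Assumption: random subsampling stands in for the paper's curated selection.
    """
    cells = {}
    for p in points:
        cells.setdefault(partition_index(p, n_side), []).append(p)
    return {c: rng.sample(pts, min(knots_per_cell, len(pts)))
            for c, pts in cells.items()}


def gaussian_rbf(point, knot, bandwidth):
    """Gaussian radial basis function centred at knot."""
    d2 = (point[0] - knot[0]) ** 2 + (point[1] - knot[1]) ** 2
    return math.exp(-d2 / (2.0 * bandwidth ** 2))


def basis_row(point, knots, bandwidths, n_side):
    """Basis values for one location, using only the knots of its own cell."""
    c = partition_index(point, n_side)
    return [gaussian_rbf(point, k, bandwidths[c]) for k in knots.get(c, [])]


rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(200)]

# Steps (1) and (2): partition into a 2 x 2 grid and choose knots per cell.
knots = select_knots(points, n_side=2, knots_per_cell=5, rng=rng)

# Step (3): partition-varying bandwidths. These are fixed placeholders here;
# in an adaptive scheme they would be estimated during model fitting.
bandwidths = {c: 0.25 for c in knots}

row = basis_row(points[0], knots, bandwidths, n_side=2)
```

Stacking `basis_row` over all locations gives a basis matrix whose product with a coefficient vector would approximate the latent spatial surface. Restricting each location to the knots of its own cell is what keeps that matrix small in this sketch.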