Probabilistic modeling of multidimensional spatiotemporal data is critical to many real-world applications. However, real-world spatiotemporal data often exhibit complex dependencies that are nonstationary (the correlation structure varies with location and time) and nonseparable (space and time interact, so the dependence cannot be factored into separate spatial and temporal parts). Developing effective and computationally efficient statistical models for nonstationary, nonseparable processes that contain both long-range and short-scale variations is challenging, especially for large-scale datasets with various corruption and missing-data patterns. In this paper, we propose a new statistical framework, Bayesian Complementary Kernelized Learning (BCKL), to achieve scalable probabilistic modeling of multidimensional spatiotemporal data. To describe complex dependencies effectively, BCKL integrates kernelized low-rank factorization with short-range spatiotemporal Gaussian processes (GPs), so that the two components complement each other. Specifically, we use a multilinear low-rank factorization component to capture the global, long-range correlations in the data, and we introduce an additive short-scale GP based on compactly supported kernel functions to characterize the remaining local variability. We develop an efficient Markov chain Monte Carlo (MCMC) algorithm for model inference and evaluate the proposed BCKL framework on both synthetic and real-world spatiotemporal datasets. Our results confirm the superior performance of BCKL in providing accurate posterior mean estimates and high-quality uncertainty estimates.
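To make the additive structure concrete, the following is a minimal generative sketch of a "global low-rank part plus local compactly supported GP" decomposition on a space-by-time grid. It is an illustration under stated assumptions, not the BCKL implementation: the function names, the squared-exponential prior on the factors, the Wendland kernel, the separable product form of the local covariance, and all hyperparameter values are hypothetical choices made here for simplicity. The sketch only draws data from such a model and does not perform MCMC inference.

```python
import numpy as np

def se_kernel(x, length_scale):
    """Squared-exponential kernel: smooth, long-range correlations."""
    d = np.abs(x[:, None] - x[None, :])
    return np.exp(-0.5 * (d / length_scale) ** 2)

def wendland_kernel(x, support):
    """Wendland kernel (1 - r)^4 (4r + 1) for r < 1, exactly zero beyond `support`."""
    r = np.abs(x[:, None] - x[None, :]) / support
    return np.where(r < 1.0, (1.0 - r) ** 4 * (4.0 * r + 1.0), 0.0)

def chol(K, jitter=1e-6):
    """Cholesky factor with a small diagonal jitter for numerical stability."""
    return np.linalg.cholesky(K + jitter * np.eye(K.shape[0]))

def sample_additive_model(n_space=30, n_time=40, rank=3, noise_std=0.1, seed=0):
    """Draw Y = global low-rank part + local short-scale part + noise (all assumed forms)."""
    rng = np.random.default_rng(seed)
    s = np.linspace(0.0, 1.0, n_space)  # 1-D spatial coordinates (assumption)
    t = np.linspace(0.0, 1.0, n_time)   # time stamps (assumption)

    # Global part: each factor column is a GP draw, so the factors are "kernelized".
    U = chol(se_kernel(s, 0.3)) @ rng.standard_normal((n_space, rank))
    V = chol(se_kernel(t, 0.3)) @ rng.standard_normal((n_time, rank))
    global_part = U @ V.T  # rank <= `rank`

    # Local part: matrix-normal draw with compactly supported covariances.
    # A separable product covariance is a simplification assumed for this sketch.
    Ks = wendland_kernel(s, 0.15)
    Kt = wendland_kernel(t, 0.15)
    Z = rng.standard_normal((n_space, n_time))
    local_part = 0.3 * chol(Ks) @ Z @ chol(Kt).T

    noise = noise_std * rng.standard_normal((n_space, n_time))
    y = global_part + local_part + noise
    return {"y": y, "global": global_part, "local": local_part, "noise": noise}

out = sample_additive_model()
Ks = wendland_kernel(np.linspace(0.0, 1.0, 30), 0.15)
print("data shape:", out["y"].shape)
print("fraction of exact zeros in the local spatial kernel:", (Ks == 0).mean())
```

Because the compactly supported kernel is exactly zero beyond its support radius, most entries of the local covariance matrix vanish. This kind of sparsity is what makes a short-scale GP component computationally attractive on large grids, while the low-rank part carries the long-range structure.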