Bayesian inference remains computationally challenging, especially in spatial-temporal modeling, where high-dimensional latent parameter spaces are ubiquitous. The methodology of integrated nested Laplace approximations (INLA) provides a framework for Bayesian inference that applies to a large subclass of additive Bayesian hierarchical models. Combined with the stochastic partial differential equations (SPDE) approach, it yields an efficient method for spatial-temporal modeling. In this work we build on the INLA-SPDE approach by putting forward a performant distributed-memory variant, INLA-DIST, for large-scale applications. The computational kernels that arise are Cholesky factorizations, linear system solves, and selected matrix inversions. To perform them we present two numerical solver options: a sparse CPU-based library and a novel blocked GPU-accelerated approach. We leverage the recurring nonzero block structure of the precision (inverse covariance) matrices, which allows us to employ dense subroutines within a sparse setting. Both versions of INLA-DIST are highly scalable and capable of performing inference on models with millions of latent parameters. We demonstrate their accuracy and performance on synthetic as well as real-world climate datasets.
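To make the idea of dense subroutines inside a sparse structure concrete, the following is a minimal NumPy sketch of the three kernels named above (Cholesky factorization, linear solve, selected inversion). It assumes, purely for illustration, a symmetric positive definite block-tridiagonal precision matrix with equally sized blocks; the abstract does not specify the exact block pattern, and this is not the INLA-DIST implementation. All function names are hypothetical, and `np.linalg.solve` stands in for the dedicated triangular solves a production code would use.

```python
import numpy as np


def assemble_dense(diag, sub):
    """Build the full dense matrix from its blocks (for checking only)."""
    n, b = len(diag), diag[0].shape[0]
    Q = np.zeros((n * b, n * b))
    for i in range(n):
        Q[i * b:(i + 1) * b, i * b:(i + 1) * b] = diag[i]
    for i in range(n - 1):
        Q[(i + 1) * b:(i + 2) * b, i * b:(i + 1) * b] = sub[i]
        Q[i * b:(i + 1) * b, (i + 1) * b:(i + 2) * b] = sub[i].T
    return Q


def block_cholesky(diag, sub):
    """Factor Q = L L^T, where diag[i] = Q[i, i] and sub[i] = Q[i+1, i].

    Returns the diagonal and subdiagonal blocks of the block-bidiagonal L.
    Only dense b-by-b operations are used; zero blocks are never touched.
    """
    L_diag, L_sub = [], []
    S = diag[0]
    for i in range(len(diag)):
        L_ii = np.linalg.cholesky(S)  # dense Cholesky of the current block
        L_diag.append(L_ii)
        if i < len(sub):
            # L[i+1, i] = Q[i+1, i] L_ii^{-T} (general solve used as a stand-in)
            L_next = np.linalg.solve(L_ii, sub[i].T).T
            L_sub.append(L_next)
            # Schur complement update of the next diagonal block
            S = diag[i + 1] - L_next @ L_next.T
    return L_diag, L_sub


def block_solve(L_diag, L_sub, rhs):
    """Solve Q x = rhs by block forward and backward substitution."""
    n = len(L_diag)
    r = np.split(np.asarray(rhs, dtype=float), n)  # assumes equal block sizes
    y = [None] * n
    for i in range(n):  # forward pass: L y = rhs
        t = r[i] if i == 0 else r[i] - L_sub[i - 1] @ y[i - 1]
        y[i] = np.linalg.solve(L_diag[i], t)
    x = [None] * n
    for i in reversed(range(n)):  # backward pass: L^T x = y
        t = y[i] if i == n - 1 else y[i] - L_sub[i].T @ x[i + 1]
        x[i] = np.linalg.solve(L_diag[i].T, t)
    return np.concatenate(x)


def selected_inverse_diag(L_diag, L_sub):
    """Diagonal blocks of Q^{-1} via a backward recurrence on the factor."""
    n = len(L_diag)
    Sigma = [None] * n
    inv_L = np.linalg.inv(L_diag[-1])
    Sigma[-1] = inv_L.T @ inv_L
    for i in reversed(range(n - 1)):
        inv_L = np.linalg.inv(L_diag[i])
        G = L_sub[i] @ inv_L
        Sigma[i] = inv_L.T @ inv_L + G.T @ Sigma[i + 1] @ G
    return Sigma
```

In this sketch the cost grows linearly with the number of blocks and cubically with the block size, since the zero blocks outside the tridiagonal band are never stored or computed. The diagonal blocks returned by `selected_inverse_diag` contain the marginal variances of the latent parameters, and the log-determinant of the precision matrix follows from the factor as twice the sum of the logarithms of the diagonal entries of the blocks in `L_diag`.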