In this era of big data, all scientific disciplines are evolving fast to cope up with the enormity of the available information. So is statistics, the queen of science. Big data are particularly relevant to spatio-temporal statistics, thanks to much-improved technology in satellite based remote sensing and Geographical Information Systems. However, none of the existing approaches seem to meet the simultaneous demand of reality emulation and cheap computation. In this article, with the Levy random fields as the starting point, e construct a new Bayesian nonparametric, nonstationary and nonseparable dynamic spatio-temporal model with the additional realistic property that the lagged spatio-temporal correlations converge to zero as the lag tends to infinity. Although our Bayesian model seems to be intricately structured and is variable-dimensional with respect to each time index, we are able to devise a fast and efficient parallel Markov Chain Monte Carlo (MCMC) algorithm for Bayesian inference. Our simulation experiment brings out quite encouraging performance from our Bayesian Levy-dynamic approach. We finally apply our Bayesian Levy-dynamic model and methods to a sea surface temperature dataset consisting of 139,300 data points in space and time. Although not big data in the true sense, this is a large and highly structured data by any standard. Even for this large and complex data, our parallel MCMC algorithm, implemented on 80 processors, generated 110,000 MCMC realizations from the Levy-dynamic posterior within a single day, and the resultant Bayesian posterior predictive analysis turned out to be encouraging. Thus, it is not unreasonable to expect that with significantly more computing resources, it is feasible to analyse terabytes of spatio-temporal data with our new model and methods.
翻译:暂无翻译