Spatial count data models are used to explain and predict the frequency of phenomena such as traffic accidents in geographically distinct entities such as census tracts or road segments. These models are typically estimated using Bayesian Markov chain Monte Carlo (MCMC) simulation methods, which are computationally expensive and do not scale well to large datasets. Variational Bayes (VB), a method from machine learning, addresses these shortcomings of MCMC by casting Bayesian estimation as an optimisation problem instead of a simulation problem. This paper derives a VB method for posterior inference in negative binomial models with unobserved parameter heterogeneity and spatial dependence. Pólya-Gamma augmentation is used to deal with the non-conjugacy of the negative binomial likelihood, and an integrated non-factorised specification of the variational distribution is adopted to capture posterior dependencies. The benefits of the proposed approach are demonstrated in a Monte Carlo study and in an empirical application on estimating youth pedestrian injury counts in census tracts of New York City. In both the simulation and the empirical study, the VB approach is around 45 to 50 times faster than MCMC on a regular eight-core processor, while offering similar estimation and predictive accuracy. Conditional on the availability of computational resources, the embarrassingly parallel architecture of the proposed VB method can be exploited to further accelerate its estimation by up to 20 times.
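As a brief sketch of why the augmentation resolves the non-conjugacy, assume the standard logit parameterisation of the negative binomial. The notation here is illustrative and may differ from the paper's: $y_i$ is the count in spatial unit $i$, $r$ the dispersion parameter, $\psi_i$ the linear predictor and $\omega_i$ the auxiliary Pólya-Gamma variable.

```latex
\begin{align}
p(y_i \mid r, \psi_i)
  &= \frac{\Gamma(y_i + r)}{\Gamma(r)\, y_i!}\,
     \frac{(e^{\psi_i})^{y_i}}{(1 + e^{\psi_i})^{y_i + r}}, \\
\frac{(e^{\psi})^{a}}{(1 + e^{\psi})^{b}}
  &= 2^{-b} e^{\kappa \psi} \int_0^\infty e^{-\omega \psi^2 / 2}\, p(\omega)\, d\omega,
  \qquad \kappa = a - \tfrac{b}{2}, \quad \omega \sim \mathrm{PG}(b, 0).
\end{align}
```

Setting $a = y_i$ and $b = y_i + r$ gives $\kappa_i = (y_i - r)/2$. Conditional on $\omega_i$, each likelihood contribution is then proportional to $\exp(\kappa_i \psi_i - \omega_i \psi_i^2 / 2)$, a Gaussian kernel in $\psi_i$, so Gaussian priors on the components of $\psi_i$ become conditionally conjugate.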