Markov chain Monte Carlo (MCMC) provides asymptotically consistent estimates of intractable posterior expectations as the number of iterations tends to infinity. However, in large-data applications, MCMC can be computationally expensive per iteration. This has catalyzed interest in sampling methods such as approximate MCMC, which trade asymptotic consistency for improved computational speed. In this article, we propose estimators based on couplings of Markov chains to assess the quality of such asymptotically biased sampling methods. The estimators give empirical upper bounds on the Wasserstein distance between the limiting distribution of the asymptotically biased sampling method and the original target distribution of interest. We establish theoretical guarantees for our upper bounds and show that our estimators can remain effective in high dimensions. We apply our quality measures to stochastic gradient MCMC, variational Bayes, and Laplace approximations for tall data, and to approximate MCMC for Bayesian logistic regression in 4500 dimensions and Bayesian linear regression in 50000 dimensions.
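To make the idea of a coupling-based upper bound concrete, the following is a minimal toy sketch and not the estimator proposed in the article. It assumes two one-dimensional autoregressive chains, one with limiting distribution N(0, 1) standing in for the exact target and one with limiting distribution N(mu, s^2) standing in for a biased sampler, and drives both with the same Gaussian noise (a common-random-numbers coupling, chosen here purely for illustration). Because the pair of chains defines a joint distribution with the correct marginals, the root mean squared distance between them estimates an upper bound on the 2-Wasserstein distance between the two limits. The function name `coupled_w2_upper_bound` and all parameter values are hypothetical.

```python
import math
import random


def coupled_w2_upper_bound(mu, s, rho=0.9, n_iter=50000, burn_in=1000, seed=0):
    """Toy estimate of an upper bound on W2(N(0, 1), N(mu, s^2)).

    Two AR(1) chains are run with shared noise (an assumed coupling):
      x_{t+1} = rho * x_t + sqrt(1 - rho^2) * z_t                (limit N(0, 1))
      y_{t+1} = mu + rho * (y_t - mu) + s * sqrt(1 - rho^2) * z_t (limit N(mu, s^2))
    The time average of (x_t - y_t)^2 estimates E|X - Y|^2 under the coupling,
    which is at least W2^2 by the definition of the Wasserstein distance.
    """
    rng = random.Random(seed)
    noise_scale = math.sqrt(1.0 - rho ** 2)
    x, y = 0.0, mu  # start each chain at the mean of its own limit
    total = 0.0
    for t in range(burn_in + n_iter):
        z = rng.gauss(0.0, 1.0)  # shared innovation for both chains
        x = rho * x + noise_scale * z
        y = mu + rho * (y - mu) + s * noise_scale * z
        if t >= burn_in:
            total += (x - y) ** 2
    return math.sqrt(total / n_iter)


mu, s = 0.5, 1.3
estimate = coupled_w2_upper_bound(mu, s)
# Closed form for the 2-Wasserstein distance between 1D Gaussians.
exact = math.sqrt(mu ** 2 + (s - 1.0) ** 2)
print(f"coupling-based estimate: {estimate:.3f}, exact W2: {exact:.3f}")
```

In this Gaussian toy case the shared-noise coupling happens to be optimal, so the estimate should land close to the closed-form distance up to Monte Carlo error; in general a coupling only yields an upper bound, and how tight it is depends on how the chains are coupled.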