We consider Markov chain Monte Carlo (MCMC) algorithms for Bayesian high-dimensional regression with continuous shrinkage priors. A common challenge with these algorithms is choosing the number of iterations to perform. This choice is critical when each iteration is expensive, as is the case with modern data sets such as genome-wide association studies, which can have thousands of rows and up to hundreds of thousands of columns. We develop coupling techniques tailored to the setting of high-dimensional regression with shrinkage priors, which enable practical, non-asymptotic diagnostics of convergence without relying on traceplots or long-run asymptotics. By establishing geometric drift and minorization conditions for the algorithm under consideration, we prove that the proposed couplings have finite expected meeting time. Focusing on a class of shrinkage priors which includes the 'Horseshoe', we empirically demonstrate the scalability of the proposed couplings. A highlight of our findings is that fewer than 1000 iterations can be enough for a Gibbs sampler to reach stationarity in a regression on 100,000 covariates. The numerical results also illustrate the impact of the prior on the computational efficiency of the coupling, and suggest the use of priors where the local precisions are Half-t distributed with degrees of freedom larger than one.
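To make the notion of a meeting time concrete, the following is a minimal sketch on a toy problem, not the Gibbs sampler or the couplings studied here. It assumes a one-dimensional AR(1) chain with unit-variance Gaussian transitions and couples two copies with a standard maximal coupling of the transition kernels, without any lag between the chains. The function names, the autoregression coefficient `rho`, and the starting points are all hypothetical choices made for illustration.

```python
import math
import random


def log_normal_pdf(x, mean):
    """Log density of N(mean, 1) evaluated at x."""
    return -0.5 * (x - mean) ** 2 - 0.5 * math.log(2.0 * math.pi)


def maximal_coupling(mean_p, mean_q, rng):
    """Draw (X, Y) with X ~ N(mean_p, 1) and Y ~ N(mean_q, 1),
    maximizing the probability that X == Y."""
    x = rng.gauss(mean_p, 1.0)
    log_w = math.log(1.0 - rng.random())  # log of a Uniform(0, 1] draw
    if log_w + log_normal_pdf(x, mean_p) <= log_normal_pdf(x, mean_q):
        return x, x  # the two draws coincide
    # Otherwise draw Y from the residual of q by rejection sampling.
    while True:
        y = rng.gauss(mean_q, 1.0)
        log_w = math.log(1.0 - rng.random())
        if log_w + log_normal_pdf(y, mean_q) > log_normal_pdf(y, mean_p):
            return x, y


def meeting_time(x0, y0, rho, rng, max_iter=10_000):
    """Run two coupled AR(1) chains (toy assumption) until they meet.
    Returns the first iteration at which the states are equal,
    or max_iter if they have not met by then."""
    x, y = x0, y0
    for t in range(1, max_iter + 1):
        x, y = maximal_coupling(rho * x, rho * y, rng)
        if x == y:
            return t
    return max_iter


rng = random.Random(1)
# Start the two chains far apart and record when they first coincide.
times = [meeting_time(10.0, -10.0, 0.5, rng) for _ in range(200)]
print("mean meeting time:", sum(times) / len(times))
print("max meeting time:", max(times))
```

Once the two chains meet, their transition kernels have the same mean, so the coupling returns identical draws and the chains stay together from then on. In this toy setting, the empirical distribution of `times` gives a rough indication of how many iterations the chain needs to forget its starting point.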