Non-linear hierarchical models are commonly used in many disciplines. However, inference in the presence of non-nested effects and on large datasets is challenging and computationally burdensome. This paper provides two contributions to scalable and accurate inference. First, I derive a new mean-field variational algorithm for estimating binomial logistic hierarchical models with an arbitrary number of non-nested random effects. Second, I propose "marginally augmented variational Bayes" (MAVB) that further improves the initial approximation through a step of Bayesian post-processing. I prove that MAVB provides a guaranteed improvement in the approximation quality at low computational cost and induces dependencies that were assumed away by the initial factorization assumptions. I apply these techniques to a study of voter behavior using a high-dimensional application of the popular approach of multilevel regression and post-stratification (MRP). Existing estimation methods took hours, whereas the proposed algorithms run in minutes. The posterior means are well recovered even under strong factorization assumptions. Applying MAVB further improves the approximation by partially correcting the underestimated variance. The proposed methodology is implemented in an open-source software package.