Non-linear hierarchical models are commonly used in many disciplines. However, inference in the presence of non-nested effects and on large datasets is challenging and computationally burdensome. This paper provides two contributions to scalable and accurate inference. First, I derive a new mean-field variational algorithm for estimating binomial logistic hierarchical models with an arbitrary number of non-nested random effects. Second, I propose "marginally augmented variational Bayes" (MAVB), which further improves the initial approximation through a step of Bayesian post-processing. I prove that MAVB provides a guaranteed improvement in the approximation quality at low computational cost and induces dependencies that were assumed away by the initial factorization assumptions. I apply these techniques to a study of voter behavior using a high-dimensional application of the popular approach of multilevel regression and post-stratification (MRP). Existing estimation methods took hours, whereas the proposed algorithms run in minutes. The posterior means are well recovered even under strong factorization assumptions. Applying MAVB further improves the approximation by partially correcting the underestimated variance. The proposed methodology is implemented in an open-source software package.
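For orientation, the model class and the mean-field idea referred to above can be sketched in generic notation. The symbols below ($n_i$, $x_i$, $z_{i,j}$, $g_j[i]$, $\Sigma_j$, $\theta_k$) are illustrative assumptions and are not taken from the paper; the specific factorization used by the proposed algorithm is not stated in the abstract and is not reproduced here.

```latex
% Binomial logistic model with J non-nested random effects (generic notation, assumed):
% observation i has n_i trials and belongs to group g_j[i] of the j-th effect.
\[
  y_i \mid p_i \sim \mathrm{Binomial}(n_i, p_i), \qquad
  \operatorname{logit}(p_i) = x_i^\top \beta + \sum_{j=1}^{J} z_{i,j}^\top \alpha_{j,\,g_j[i]}, \qquad
  \alpha_{j,g} \sim \mathcal{N}(0, \Sigma_j).
\]

% Generic mean-field variational approximation: the posterior over all unknowns
% \theta is approximated by a product of independent factors chosen to minimize
% the Kullback-Leibler divergence to the true posterior.
\[
  q^{*}(\theta) = \operatorname*{arg\,min}_{q \,:\, q(\theta) = \prod_{k} q_k(\theta_k)}
  \mathrm{KL}\bigl( q(\theta) \,\big\|\, p(\theta \mid y) \bigr).
\]
```

Because the factors $q_k$ are independent by construction, any posterior dependence between blocks $\theta_k$ is dropped, which is one common reason mean-field approximations tend to understate posterior variance.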