Automatic differentiation variational inference (ADVI) offers fast and easy-to-use posterior approximation in multiple modern probabilistic programming languages. However, its stochastic optimizer lacks clear convergence criteria and requires tuning parameters. Moreover, ADVI inherits the poor posterior uncertainty estimates of mean-field variational Bayes (MFVB). We introduce ``deterministic ADVI'' (DADVI) to address these issues. DADVI replaces the intractable MFVB objective with a fixed Monte Carlo approximation, a technique known in the stochastic optimization literature as the ``sample average approximation'' (SAA). By optimizing an approximate but deterministic objective, DADVI can use off-the-shelf second-order optimization, and, unlike standard mean-field ADVI, is amenable to more accurate posterior linear response (LR) covariance estimates. In contrast to existing worst-case theory, we show that, on certain classes of common statistical problems, DADVI and the SAA can perform well with relatively few samples even in very high dimensions, though we also show that such favorable results cannot extend to variational approximations that are too expressive relative to mean-field ADVI. We show on a variety of real-world problems that DADVI reliably finds good solutions with default settings (unlike ADVI) and, together with LR covariances, is typically faster and more accurate than standard ADVI.
Black Box Variational Inference with a Deterministic Objective: Faster, More Accurate, and Even More Black Box
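To make the fixed-draw idea concrete, below is a minimal sketch, not the authors' implementation, of a DADVI-style deterministic objective for a mean-field Gaussian approximation: the standard-normal draws are sampled once and frozen (the SAA), the resulting deterministic negative ELBO is handed to an off-the-shelf second-order optimizer, and the mean block of the inverse Hessian at the optimum serves as a linear-response-style covariance estimate. The toy log_joint, the dimensions, the seed, and all names here are hypothetical placeholders.

```python
# A minimal, self-contained sketch of the fixed-draw (SAA) idea, assuming a
# mean-field Gaussian approximation q(theta) = N(mu, diag(sigma^2)) and a
# user-supplied log_joint(). The toy target below is purely illustrative.
import jax
import jax.numpy as jnp
import numpy as np
from scipy.optimize import minimize

jax.config.update("jax_enable_x64", True)

dim, num_draws = 5, 30

def log_joint(theta):
    # Stand-in for log p(theta, data); replace with a real model's log density.
    return -0.5 * jnp.sum((theta - 2.0) ** 2)

# Draw standard-normal samples ONCE and keep them fixed. Standard ADVI would
# redraw fresh z's at every optimizer step; freezing them makes the objective
# deterministic -- the sample average approximation (SAA).
zs = jax.random.normal(jax.random.PRNGKey(0), (num_draws, dim))

def neg_saa_elbo(params):
    mu, log_sigma = params[:dim], params[dim:]
    thetas = mu + jnp.exp(log_sigma) * zs            # reparameterized draws
    expected_log_p = jnp.mean(jax.vmap(log_joint)(thetas))
    entropy = jnp.sum(log_sigma)                     # Gaussian entropy, up to a constant
    return -(expected_log_p + entropy)

# A deterministic objective permits off-the-shelf second-order optimization
# with a genuine convergence criterion, instead of a tuned stochastic optimizer.
val_and_grad = jax.jit(jax.value_and_grad(neg_saa_elbo))
hess_fn = jax.jit(jax.hessian(neg_saa_elbo))

def fun(p):
    val, grad = val_and_grad(jnp.asarray(p))
    return float(val), np.asarray(grad)

result = minimize(fun, np.zeros(2 * dim), jac=True, method="trust-ncg",
                  hess=lambda p: np.asarray(hess_fn(jnp.asarray(p))))

mu_opt = result.x[:dim]  # approximate posterior means
# Linear response (LR) covariance, sketched: with this (mu, log_sigma)
# parameterization, E_q[theta] = mu, so an LR-style covariance estimate is
# the mu-mu block of the inverse Hessian of the objective at the optimum.
lr_cov = np.linalg.inv(np.asarray(hess_fn(jnp.asarray(result.x))))[:dim, :dim]
print(mu_opt, np.diag(lr_cov))
```

Note the design choice this sketch highlights: because the frozen-draw objective is an ordinary smooth function, the same Hessian machinery that drives the optimizer can be reused at the optimum for the covariance estimate, with no extra sampling.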