We revisit the theory of importance weighted variational inference (IWVI), a promising strategy for learning latent variable models. IWVI uses new variational bounds, known as Monte Carlo objectives (MCOs), obtained by replacing intractable integrals with Monte Carlo estimates, typically obtained via importance sampling. Burda, Grosse and Salakhutdinov (2016) showed that increasing the number of importance samples provably tightens the gap between the bound and the likelihood. Inspired by this simple monotonicity theorem, we present a series of nonasymptotic results that link properties of Monte Carlo estimates to the tightness of MCOs. We challenge the rationale that smaller Monte Carlo variance leads to better bounds. We confirm theoretically the empirical findings of several recent papers by showing that, in a precise sense, negative correlation reduces the variational gap. We also generalise the original monotonicity theorem by considering non-uniform weights. We discuss several practical consequences of our theoretical results. Our work borrows many ideas and results from the theory of stochastic orders.