In this paper, we propose CI-VI, an efficient and scalable solver for semi-implicit variational inference (SIVI). Our method first maps SIVI's evidence lower bound (ELBO) to a form involving a nonlinear functional nesting of expected values, and then develops a rigorous optimiser capable of correctly handling the bias inherent to nonlinear nested expectations, using an extrapolation-smoothing mechanism coupled with gradient sketching. Our theoretical results demonstrate convergence to a stationary point of the ELBO in general non-convex settings, typically arising when using deep network models, with a gradient-bias-vanishing rate of order $O(t^{-\frac{4}{5}})$. We believe these results generalise beyond the specific nesting arising from SIVI to other forms. Finally, in a set of experiments, we demonstrate the effectiveness of our algorithm in approximating complex posteriors on various data-sets, including those from natural language processing.
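To illustrate the nested-expectation issue the abstract refers to, the following is a minimal sketch, not the paper's method: in SIVI the marginal $q_\phi(z)$ is itself an inner expectation over a mixing variable, so terms like $\log q_\phi(z)$ apply a nonlinear function to an expectation, and a plug-in Monte Carlo estimate is biased. The toy Gaussian mixing distribution, the decaying smoothing weight, and names such as `inner_samples` are illustrative assumptions, and the smoothing loop is only a simplified stand-in for the extrapolation-smoothing mechanism described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy semi-implicit construction (assumption for illustration):
# eps ~ N(0, 1) and q(z | eps) = N(eps, 1), so the marginal q(z) = E_eps[q(z | eps)]
# is N(0, 2).  The ELBO term log q(z) is a nonlinear (log) function of an inner
# expectation, so a small-sample plug-in estimate is biased (Jensen's inequality).

def inner_samples(z, n):
    """Draws of q(z | eps) for fresh eps; their mean is an unbiased estimate of q(z)."""
    eps = rng.normal(size=n)
    return np.exp(-0.5 * (z - eps) ** 2) / np.sqrt(2 * np.pi)

z = 0.7
true_log_q = -0.5 * z**2 / 2 - 0.5 * np.log(2 * np.pi * 2)  # log N(z; 0, 2)

# Plug-in estimator: log of a small-sample mean (biased).
plugin = np.log(inner_samples(z, 5).mean())

# Smoothed estimator (hypothetical simplification of extrapolation-smoothing):
# maintain a running estimate u_t of the inner expectation with a decaying
# weight, and apply the nonlinear log only to u_t, so the bias shrinks over time.
u = inner_samples(z, 5).mean()
for t in range(1, 2000):
    beta = 1.0 / (t + 1) ** 0.8
    u = (1 - beta) * u + beta * inner_samples(z, 5).mean()
smoothed = np.log(u)

print(f"true log q(z)     : {true_log_q:.4f}")
print(f"plug-in (biased)  : {plugin:.4f}")
print(f"smoothed estimate : {smoothed:.4f}")
```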