Efficient low-variance gradient estimation enabled by the reparameterization trick (RT) has been essential to the success of variational autoencoders. Doubly-reparameterized gradients (DReGs) improve on the RT for multi-sample variational bounds by applying reparameterization a second time for an additional reduction in variance. Here, we develop two generalizations of the DReGs estimator and show that they can be used to train conditional and hierarchical VAEs on image modelling tasks more effectively. First, we extend the estimator to hierarchical models with several stochastic layers by showing how to treat additional score function terms due to the hierarchical variational posterior. We then generalize DReGs to score functions of arbitrary distributions instead of just those of the sampling distribution, which makes the estimator applicable to the parameters of the prior in addition to those of the posterior.
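As a rough illustration of the single-layer DReGs estimator that this work builds on, the following minimal sketch computes the doubly-reparameterized gradient of a K-sample bound with respect to the parameters of a Gaussian variational posterior. The toy model (p(z) = N(0, 1), p(x | z) = N(z, 1)), the function name `dreg_gradient`, and its arguments are assumptions made for this example only; they are not taken from the paper, and the hierarchical and prior-parameter generalizations described above are not covered.

```python
import math
import random


def dreg_gradient(x, mu, sigma, num_samples=8, rng=None):
    """DReG-style gradient of a K-sample bound w.r.t. (mu, sigma) of q(z) = N(mu, sigma^2).

    Toy model assumed for illustration: p(z) = N(0, 1), p(x | z) = N(z, 1).
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible

    log_w, dlogw_dz, eps_list = [], [], []
    for _ in range(num_samples):
        eps = rng.gauss(0.0, 1.0)
        z = mu + sigma * eps  # reparameterized sample from q
        # log w = log p(x, z) - log q(z); additive constants are dropped
        # because they cancel in the normalized weights below.
        log_p = -0.5 * z * z - 0.5 * (x - z) ** 2
        log_q = -0.5 * ((z - mu) / sigma) ** 2 - math.log(sigma)
        log_w.append(log_p - log_q)
        # Derivative of log w with respect to z, holding mu and sigma fixed.
        dlogw_dz.append((x - 2.0 * z) + (z - mu) / sigma ** 2)
        eps_list.append(eps)

    # Self-normalized importance weights via a numerically stable softmax.
    max_log_w = max(log_w)
    unnorm = [math.exp(lw - max_log_w) for lw in log_w]
    total = sum(unnorm)
    weights = [u / total for u in unnorm]

    # DReG: sum_k weights_k^2 * (d log w_k / d z_k) * (d z_k / d parameter).
    grad_mu = sum(w ** 2 * g for w, g in zip(weights, dlogw_dz))  # dz/dmu = 1
    grad_sigma = sum(
        w ** 2 * g * e for w, g, e in zip(weights, dlogw_dz, eps_list)
    )  # dz/dsigma = eps
    return grad_mu, grad_sigma
```

In this toy model the exact posterior is N(x/2, 1/2), so calling `dreg_gradient(2.0, 1.0, 0.5 ** 0.5)` should return gradients that vanish up to floating-point error for every sample draw, since log w no longer depends on z. That zero-variance behavior at the optimum is the property that motivates the squared-weight form.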