Efficient low-variance gradient estimation enabled by the reparameterization trick (RT) has been essential to the success of variational autoencoders. Doubly-reparameterized gradients (DReGs) improve on the RT for multi-sample variational bounds by applying reparameterization a second time for an additional reduction in variance. Here, we develop two generalizations of the DReGs estimator and show that they can be used to train conditional and hierarchical VAEs on image modelling tasks more effectively. We first extend the estimator to hierarchical models with several stochastic layers by showing how to treat additional score function terms due to the hierarchical variational posterior. We then generalize DReGs to score functions of arbitrary distributions instead of just those of the sampling distribution, which makes the estimator applicable to the parameters of the prior in addition to those of the posterior.
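To make the role of the reparameterization trick concrete, here is a minimal sketch (our own illustration, not the paper's code) of estimating the gradient of an expectation under a Gaussian by reparameterizing z = μ + σ·ε with ε ~ N(0, 1), so that gradients flow through the sample:

```python
import jax
import jax.numpy as jnp

def reparam_objective(params, eps, f):
    """Monte Carlo estimate of E_{z ~ N(mu, sigma^2)}[f(z)] via the
    reparameterization z = mu + sigma * eps, eps ~ N(0, 1)."""
    mu, log_sigma = params
    z = mu + jnp.exp(log_sigma) * eps
    return jnp.mean(f(z))

# Illustrative test function: f(z) = z^2, so E[f(z)] = mu^2 + sigma^2,
# giving analytic gradients d/dmu = 2*mu and d/dlog_sigma = 2*sigma^2.
key = jax.random.PRNGKey(0)
eps = jax.random.normal(key, (100_000,))
params = (jnp.array(1.5), jnp.array(0.0))  # mu = 1.5, sigma = 1.0

# jax.grad differentiates through the sampling path; no score-function
# (REINFORCE) term is needed, which is what keeps the variance low.
g_mu, g_log_sigma = jax.grad(reparam_objective)(params, eps, lambda z: z ** 2)
```

With 100k samples the estimates land close to the analytic values (≈3.0 for μ and ≈2.0 for log σ). DReGs build on this by reparameterizing a second time the residual score-function terms that appear in multi-sample bounds; the generalizations in the paper extend that treatment to hierarchical posteriors and to prior parameters.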