This paper introduces novel results for the score function gradient estimator of the importance weighted variational bound (IWAE). We prove that, in the limit of large $K$ (the number of importance samples), one can choose the control variate such that the signal-to-noise ratio (SNR) of the estimator grows as $\sqrt{K}$. This is in contrast to the standard pathwise gradient estimator, whose SNR decreases as $1/\sqrt{K}$. Based on our theoretical findings, we develop a novel control variate that extends VIMCO. Empirically, when training both continuous and discrete generative models, the proposed method yields superior variance reduction, resulting in an SNR for IWAE that increases with $K$ without relying on the reparameterization trick. The novel estimator is competitive with state-of-the-art reparameterization-free gradient estimators such as Reweighted Wake-Sleep (RWS) and the thermodynamic variational objective (TVO) when training generative models.
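To make the two quantities in the abstract concrete, the sketch below computes a $K$-sample importance weighted bound from log importance weights and an empirical SNR from repeated gradient estimates. It is only an illustration under stated assumptions: the SNR is taken to be the absolute mean divided by the standard deviation of one gradient component (the abstract does not spell out its definition), and the function names `iwae_bound` and `empirical_snr` are hypothetical, not taken from the paper.

```python
import math
import statistics


def iwae_bound(log_weights):
    """Estimate log((1/K) * sum_k w_k) from K log importance weights log w_k."""
    k = len(log_weights)
    m = max(log_weights)
    # Log-sum-exp with the maximum subtracted for numerical stability.
    return m + math.log(sum(math.exp(lw - m) for lw in log_weights)) - math.log(k)


def empirical_snr(gradient_samples):
    """Assumed SNR definition: |mean| / sample standard deviation
    of repeated estimates of a single gradient component."""
    mean = statistics.fmean(gradient_samples)
    std = statistics.stdev(gradient_samples)
    return abs(mean) / std
```

Measuring `empirical_snr` over many independent gradient estimates at several values of $K$ is one way to check whether a given estimator's SNR rises or falls with $K$.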