We propose the use of U-statistics to reduce variance for gradient estimation in importance-weighted variational inference. The key observation is that, given a base gradient estimator that requires $m > 1$ samples and a total of $n > m$ samples available for estimation, lower variance is achieved by averaging the base estimator over overlapping batches of size $m$ rather than over disjoint batches, as is currently done. We use classical U-statistic theory to analyze the variance reduction, and propose novel approximations with theoretical guarantees to ensure computational efficiency. We find empirically that U-statistic variance reduction can lead to modest to significant improvements in inference performance on a range of models, with little computational cost.
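As a minimal illustration of the averaging scheme described above (not the paper's gradient estimator): take a toy base estimator that needs $m = 2$ samples, and compare averaging it over all overlapping size-$m$ subsets of the $n$ samples (the U-statistic) against averaging it over disjoint size-$m$ batches. The sample-range kernel below is a hypothetical stand-in chosen only so the two averages differ; both estimates are unbiased for the same quantity, but the U-statistic has lower variance across repeated draws.

```python
import itertools
import random
import statistics

def u_statistic(samples, m, kernel):
    """Average the base estimator (kernel of m samples) over ALL
    overlapping size-m subsets of the n samples: a U-statistic."""
    vals = [kernel(batch) for batch in itertools.combinations(samples, m)]
    return sum(vals) / len(vals)

def disjoint_estimate(samples, m, kernel):
    """Average the same kernel over disjoint size-m batches only."""
    batches = [samples[i:i + m] for i in range(0, len(samples) - m + 1, m)]
    vals = [kernel(b) for b in batches]
    return sum(vals) / len(vals)

# Toy kernel requiring m = 2 samples: the range of the pair.
kernel = lambda pair: max(pair) - min(pair)

random.seed(0)
n, m, reps = 8, 2, 2000
u_vals, d_vals = [], []
for _ in range(reps):
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    u_vals.append(u_statistic(xs, m, kernel))
    d_vals.append(disjoint_estimate(xs, m, kernel))

# Same target, lower spread for the overlapping (U-statistic) average.
print(statistics.variance(u_vals) < statistics.variance(d_vals))  # prints True
```

The cost of the exact U-statistic grows as $\binom{n}{m}$ kernel evaluations, which is why the abstract emphasizes approximations with theoretical guarantees for computational efficiency.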