Gradient estimation -- approximating the gradient of an expectation with respect to the parameters of a distribution -- is central to the solution of many machine learning problems. However, when the distribution is discrete, most common gradient estimators suffer from excessive variance. To improve the quality of gradient estimation, we introduce a variance reduction technique based on Stein operators for discrete distributions. We then use this technique to build flexible control variates for the REINFORCE leave-one-out estimator. Our control variates can be adapted online to minimize the variance and do not require extra evaluations of the target function. In benchmark generative modeling tasks such as training binary variational autoencoders, our gradient estimator achieves substantially lower variance than state-of-the-art estimators with the same number of function evaluations.
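The REINFORCE leave-one-out (RLOO) estimator mentioned above can be sketched as follows. This is a minimal illustrative implementation for a single Bernoulli parameter, not the paper's full method (it omits the Stein-operator control variates); all function and variable names are our own.

```python
import numpy as np

def rloo_gradient(f, theta, K=4, rng=None):
    """REINFORCE leave-one-out estimator for
    d/dtheta E_{z ~ Bernoulli(sigmoid(theta))}[f(z)].

    Each sample's baseline is the mean of f over the OTHER K-1
    samples, which keeps the estimator unbiased while reducing
    variance. Illustrative sketch only.
    """
    rng = np.random.default_rng() if rng is None else rng
    p = 1.0 / (1.0 + np.exp(-theta))           # sigmoid(theta)
    z = (rng.random(K) < p).astype(float)      # K Bernoulli samples
    fz = np.array([f(zi) for zi in z])
    # Leave-one-out baseline: mean of f over the other samples.
    baseline = (fz.sum() - fz) / (K - 1)
    # Score function d/dtheta log q_theta(z) for a Bernoulli
    # parameterized by sigmoid(theta) is (z - p).
    score = z - p
    return np.mean((fz - baseline) * score)
```

For example, with f(z) = z and theta = 0 (so p = 0.5), the true gradient is p(1-p) = 0.25, and averaging many RLOO estimates converges to that value.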