In this paper, we aim to reduce the variance of doubly stochastic optimization, a class of stochastic optimization algorithms that contain two independent sources of randomness: the subsampling of training data and the Monte Carlo estimation of expectations. This optimization regime often suffers from large gradient variance, which slows convergence. We therefore propose the dual control variate, a new type of control variate that reduces the gradient variance from both sources jointly. The dual control variate is built upon approximation-based control variates and incremental gradient methods. We show that on doubly stochastic optimization problems, compared with previous variance reduction approaches that account for only one source of randomness, the dual control variate yields a gradient estimator with significantly smaller variance and delivers superior performance on real-world applications, such as generalized linear models with dropout and black-box variational inference.
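To make the setup concrete, the sketch below instantiates a doubly stochastic gradient estimator with two stacked control variates on a toy least-squares problem: an approximation-based control variate (a first-order Taylor surrogate with a closed-form expectation) handles the Monte Carlo noise, and an incremental-gradient (SAGA-style) memory of per-example gradients handles the data subsampling. This is a minimal illustration under those assumptions, not the paper's exact estimator; all names (e.g., `dual_cv_estimate`, `surrogate_mean`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 200, 5
X = rng.normal(size=(N, d))
y = rng.normal(size=N)

# Objective: F(theta) = (1/N) * sum_i E_eps[0.5 * ((x_i + eps) @ theta - y_i)^2],
# with eps ~ N(0, I). Two randomness sources: the index i and the noise eps.

def noisy_grad(theta, i, eps):
    """One doubly stochastic gradient sample (random example i, random eps)."""
    x = X[i] + eps
    return (x @ theta - y[i]) * x

def expected_grad(theta, i):
    """Closed-form E_eps of the per-example gradient (identity noise covariance)."""
    return (X[i] @ theta - y[i]) * X[i] + theta

def surrogate(theta, i, eps):
    """Approximation-based surrogate: noisy_grad linearized in eps."""
    r = X[i] @ theta - y[i]
    return r * X[i] + r * eps + (eps @ theta) * X[i]

def surrogate_mean(theta, i):
    """Known expectation of the surrogate over eps (linear terms average out)."""
    return (X[i] @ theta - y[i]) * X[i]

theta = np.zeros(d)
# SAGA-style memory: one stale, noise-free per-example gradient each.
memory = np.array([expected_grad(theta, i) for i in range(N)])
memory_avg = memory.mean(axis=0)

def dual_cv_estimate(theta, i, eps):
    """Unbiased gradient estimate with both control variates applied."""
    global memory_avg
    g = noisy_grad(theta, i, eps)
    # 1) Monte Carlo control variate: subtract the surrogate, add back its mean.
    g = g - surrogate(theta, i, eps) + surrogate_mean(theta, i)
    # 2) Subsampling control variate: subtract the stale memory, add its average.
    est = g - memory[i] + memory_avg
    new = expected_grad(theta, i)  # refresh this example's memory entry
    memory_avg = memory_avg + (new - memory[i]) / N
    memory[i] = new
    return est

# Plain SGD driven by the variance-reduced estimator.
lr = 0.05
for _ in range(2000):
    i = rng.integers(N)
    eps = rng.normal(size=d)
    theta -= lr * dual_cv_estimate(theta, i, eps)
```

Dropping either correction in `dual_cv_estimate` recovers a single-source control variate; the abstract's claim is that applying both jointly yields the estimator with significantly smaller variance.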