Seeking to improve model generalization, we consider a new approach based on distributionally robust learning (DRL) that applies stochastic gradient descent to the outer minimization problem. Our algorithm efficiently estimates the gradient of the inner maximization problem through multilevel Monte Carlo randomization. Leveraging theoretical results that shed light on why standard gradient estimators fail, we establish the parameterization of our gradient estimators that optimally balances the fundamental tradeoff between computation time and statistical variance. Numerical experiments demonstrate that our DRL approach yields significant benefits over previous work.
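To make the randomized multilevel construction concrete, below is a minimal Python sketch of an unbiased multilevel Monte Carlo gradient estimator in the spirit described above. It is not the paper's implementation: the callables `grad_on_samples` and `draw_samples`, and the parameters `rho` and `max_level`, are hypothetical placeholders. The sketch assumes the inner-maximization gradient is approximated by a sample-based routine that is biased at any finite sample size but consistent as the sample size grows.

```python
import numpy as np

def mlmc_gradient(grad_on_samples, draw_samples, theta,
                  rho=1.5, max_level=10, rng=None):
    """Randomized multilevel Monte Carlo estimate of the inner-problem gradient.

    grad_on_samples(theta, xs): hypothetical routine returning a (biased)
        sample-average approximation of the worst-case gradient at theta
        from the data points xs; consistent as len(xs) grows.
    draw_samples(m, rng): hypothetical routine drawing m data points.
    """
    rng = rng or np.random.default_rng()

    # Randomize the approximation level: P(N = n) proportional to 2^(-rho * n).
    probs = 2.0 ** (-rho * np.arange(max_level + 1))
    probs /= probs.sum()
    n = rng.choice(max_level + 1, p=probs)

    # Couple the fine and coarse levels by reusing the same draws: the fine
    # estimate uses all 2^n points, the coarse one only the first 2^(n-1).
    xs = draw_samples(2 ** n, rng)
    g_fine = grad_on_samples(theta, xs)
    if n == 0:
        delta = g_fine
    else:
        delta = g_fine - grad_on_samples(theta, xs[: 2 ** (n - 1)])

    # Reweighting by 1/P(N = n) makes the estimator unbiased for the
    # limiting (infinite-sample) gradient under standard moment conditions.
    return delta / probs[n]

# Plugging the estimator into plain SGD on the outer minimization
# (step count and step size are illustrative):
# for t in range(1000):
#     g = mlmc_gradient(grad_on_samples, draw_samples, theta)
#     theta = theta - 0.01 * g
```

The level-distribution exponent `rho` is exactly the kind of parameter the abstract refers to: heavier tails (smaller `rho`) spend more computation on fine levels and reduce variance, while lighter tails keep expected cost low at the price of a higher-variance estimator.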