Modeling the Second Player in Distributionally Robust Optimization

Distributionally robust optimization (DRO) provides a framework for training machine learning models that perform well on a collection of related data distributions (the "uncertainty set"). This is done by solving a min-max game: the model is trained to minimize its maximum expected loss over all distributions in the uncertainty set. While careful design of the uncertainty set is critical to the success of the DRO procedure, previous work has been limited to relatively simple alternatives that keep the min-max optimization problem exactly tractable, such as $f$-divergence balls. In this paper, we argue instead for the use of neural generative models to characterize the worst-case distribution, allowing for more flexible and problem-specific selection of the uncertainty set. However, while conceptually simple, this approach poses a number of implementation and optimization challenges. To circumvent these issues, we propose a relaxation of the KL-constrained inner maximization objective that makes the DRO problem more amenable to gradient-based optimization of large-scale generative models, and develop model selection heuristics to guide hyper-parameter search. On both toy settings and realistic NLP tasks, we find that the proposed approach yields models that are more robust than comparable baselines.
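
To make the inner maximization of the min-max game concrete, the following is a minimal sketch of a KL-penalized worst case over a finite set of per-example losses. It assumes a uniform reference distribution and replaces the KL constraint with a Lagrangian penalty of weight `tau`. This is a standard textbook relaxation chosen for illustration, not necessarily the one proposed in the paper, and it involves no neural generative model. The names `kl_penalized_worst_case`, `kl_to_uniform`, and `tau` are hypothetical.

```python
import math


def kl_penalized_worst_case(losses, tau):
    """Solve max_q sum_i q_i * loss_i - tau * KL(q || uniform) in closed form.

    Assumption for this sketch: the reference distribution is uniform over
    the given examples, and tau > 0 is the penalty weight.
    """
    m = max(losses)
    # The maximizer is q_i proportional to exp(loss_i / tau).
    # Subtracting the max loss keeps the exponentials numerically stable.
    weights = [math.exp((loss - m) / tau) for loss in losses]
    total = sum(weights)
    q = [w / total for w in weights]
    # Optimal value: tau * log(mean_i exp(loss_i / tau)).
    value = m + tau * math.log(total / len(losses))
    return q, value


def kl_to_uniform(q):
    """KL divergence from q to the uniform distribution on len(q) points."""
    n = len(q)
    return sum(qi * math.log(qi * n) for qi in q if qi > 0)


if __name__ == "__main__":
    # Hypothetical per-example losses; the third example is the hard one.
    example_losses = [0.2, 0.5, 3.0, 0.1]
    for tau in (1000.0, 1.0, 0.1):
        q, value = kl_penalized_worst_case(example_losses, tau)
        print(f"tau={tau}: q={[round(x, 3) for x in q]}, "
              f"value={value:.3f}, KL={kl_to_uniform(q):.3f}")
```

A large `tau` keeps the adversary close to the uniform distribution, so the objective approaches the average loss. A small `tau` lets it concentrate on the highest-loss examples, so the objective approaches the maximum loss. In a full DRO loop the model would then take a gradient step on the loss reweighted by `q`.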
