In this paper, we consider learning scenarios where the learned model is evaluated under an unknown test distribution which potentially differs from the training distribution (i.e., distribution shift). The learner has access to a family of weight functions such that the test distribution is a reweighting of the training distribution under one of these functions, a setting typically studied under the name of Distributionally Robust Optimization (DRO). We consider the problem of deriving regret bounds in the classical learning theory setting, and require that the resulting regret bounds hold uniformly over all potential test distributions. We show that the DRO formulation does not guarantee uniformly small regret under distribution shift. We instead propose an alternative method called Minimax Regret Optimization (MRO), and show that under suitable conditions this method achieves uniformly low regret across all test distributions. We also adapt our technique to obtain stronger guarantees when the test distributions are heterogeneous in their similarity to the training data. Given the widespread optimization of worst-case risks in current approaches to robust machine learning, we believe that MRO can be a strong alternative for addressing distribution shift scenarios.
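To make the contrast between the two formulations concrete, the following is a minimal sketch of the objectives as described above; the notation (loss \(\ell\), hypothesis class \(\Theta\), training distribution \(P\), weight-function family \(\mathcal{W}\)) is introduced here for illustration and is not taken verbatim from the paper.
\[
\text{DRO:}\quad \min_{\theta \in \Theta}\; \max_{w \in \mathcal{W}}\; \mathbb{E}_{x \sim P}\bigl[w(x)\,\ell(\theta; x)\bigr],
\qquad
\text{MRO:}\quad \min_{\theta \in \Theta}\; \max_{w \in \mathcal{W}}\; \Bigl(\mathbb{E}_{x \sim P}\bigl[w(x)\,\ell(\theta; x)\bigr] \;-\; \min_{\theta' \in \Theta}\, \mathbb{E}_{x \sim P}\bigl[w(x)\,\ell(\theta'; x)\bigr]\Bigr).
\]
In words, DRO minimizes the worst-case reweighted risk, whereas MRO minimizes the worst-case excess risk (regret) relative to the best hypothesis under each reweighting; subtracting the per-distribution optimum prevents a single intrinsically hard reweighting from dominating the objective, which is the intuition behind MRO's uniform regret guarantee.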