Recently, invariant risk minimization (IRM) was proposed as a promising approach to out-of-distribution (OOD) generalization. However, it is unclear when IRM should be preferred over the widely employed empirical risk minimization (ERM) framework. In this work, we analyze both frameworks from the perspective of sample complexity, thus taking a firm step towards answering this important question. We find that, depending on the type of data generation mechanism, the two approaches might have very different finite-sample and asymptotic behavior. For example, in the covariate shift setting, the two approaches not only arrive at the same asymptotic solution but also have similar finite-sample behavior, with no clear winner. For other distribution shifts, such as those involving confounders or anti-causal variables, however, the two approaches arrive at different asymptotic solutions: IRM is guaranteed to be close to the desired OOD solutions in the finite-sample regime, while ERM is biased even asymptotically. We further investigate how different factors -- the number of environments, the complexity of the model, and the IRM penalty weight -- impact the sample complexity of IRM in relation to its distance from the OOD solutions.
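To make the contrast between the two objectives concrete, the following is a minimal sketch, not the formulation analyzed in this work. It assumes a linear predictor, squared loss, and the gradient-norm penalty commonly known as IRMv1 (the squared derivative of each environment's risk with respect to a scalar multiplier on the predictor, evaluated at 1). All function names are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence, Tuple

# One environment: (feature rows, targets). Illustrative type alias, not from the source.
Env = Tuple[Sequence[Sequence[float]], Sequence[float]]


def predict(theta: Sequence[float], X: Sequence[Sequence[float]]) -> List[float]:
    """Linear predictions <theta, x> for every row x of X."""
    return [sum(t * x for t, x in zip(theta, row)) for row in X]


def env_risk(theta: Sequence[float], X, y) -> float:
    """Mean squared error of the linear predictor within one environment."""
    preds = predict(theta, X)
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)


def irmv1_penalty(theta: Sequence[float], X, y) -> float:
    """Assumed IRMv1-style penalty: squared derivative of the environment risk
    with respect to a scalar multiplier w on the predictions, taken at w = 1.
    For squared loss, d/dw mean((w*p - y)^2) at w = 1 equals mean(2*(p - y)*p)."""
    preds = predict(theta, X)
    grad_w = sum(2.0 * (p - t) * p for p, t in zip(preds, y)) / len(y)
    return grad_w ** 2


def erm_objective(theta: Sequence[float], envs: Sequence[Env]) -> float:
    """ERM: pool all environments and minimize the average loss."""
    X = [row for X_e, _ in envs for row in X_e]
    y = [t for _, y_e in envs for t in y_e]
    return env_risk(theta, X, y)


def irm_objective(theta: Sequence[float], envs: Sequence[Env],
                  penalty_weight: float) -> float:
    """Sum over environments of risk plus weighted invariance penalty."""
    return sum(
        env_risk(theta, X_e, y_e) + penalty_weight * irmv1_penalty(theta, X_e, y_e)
        for X_e, y_e in envs
    )
```

In this sketch, `penalty_weight` plays the role of the IRM penalty weight mentioned above: at zero the objective reduces to a sum of per-environment risks, and larger values push the solution towards predictors whose per-environment risk is stationary in the scalar multiplier.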