We show that the Invariant Risk Minimization (IRM) formulation of Arjovsky et al. (2019) can fail to capture "natural" invariances, at least when used in its practical "linear" form, and even on very simple problems which directly follow the motivating examples for IRM. This can lead to worse generalization on new environments, even when compared to unconstrained ERM. The issue stems from a significant gap between the linear variant (as in their concrete method IRMv1) and the full non-linear IRM formulation. Additionally, even when capturing the "right" invariances, we show that it is possible for IRM to learn a sub-optimal predictor, due to the loss function not being invariant across environments. The issues arise even when measuring invariance on the population distributions, but are exacerbated by the fact that IRM is extremely fragile to sampling.