Invariant Causal Prediction (Peters et al., 2016) is a technique for out-of-distribution generalization which assumes that some aspects of the data distribution vary across the training set but that the underlying causal mechanisms remain constant. Recently, Arjovsky et al. (2019) proposed Invariant Risk Minimization (IRM), an objective based on this idea for learning deep, invariant features of data which are a complex function of latent variables; many alternatives have subsequently been suggested. However, formal guarantees for all of these works are severely lacking. In this paper, we present the first analysis of classification under the IRM objective--as well as these recently proposed alternatives--under a fairly natural and general model. In the linear case, we show simple conditions under which the optimal solution succeeds or, more often, fails to recover the optimal invariant predictor. We furthermore present the very first results in the non-linear regime: we demonstrate that IRM can fail catastrophically unless the test data are sufficiently similar to the training distribution--this is precisely the issue that it was intended to solve. Thus, in this setting we find that IRM and its alternatives fundamentally do not improve over standard Empirical Risk Minimization.
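To make the objective under discussion concrete, the following is a minimal sketch of the IRMv1 relaxation introduced by Arjovsky et al. (2019), written for a linear classifier with logistic loss. It is an illustration under stated assumptions, not the formulation analyzed in this paper: the function names (`logistic_risk`, `irmv1_penalty`, `irmv1_objective`), the use of labels in {-1, +1}, and the penalty weight `lam` are all choices made here for the example. Each environment contributes its empirical risk plus a penalty equal to the squared gradient of that risk with respect to a dummy scalar multiplier on the logits, evaluated at 1.

```python
import math


def sigmoid(t):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-t))


def logistic_risk(logits, labels):
    """Mean logistic loss over one environment; labels are assumed to be in {-1, +1}."""
    losses = [math.log1p(math.exp(-y * z)) for z, y in zip(logits, labels)]
    return sum(losses) / len(losses)


def irmv1_penalty(logits, labels):
    """Squared derivative of the risk w.r.t. a dummy scale s on the logits, at s = 1.

    For the logistic loss log(1 + exp(-y * s * z)), the derivative in s at
    s = 1 is -y * z * sigmoid(-y * z); we average it over the environment.
    """
    grads = [-y * z * sigmoid(-y * z) for z, y in zip(logits, labels)]
    grad = sum(grads) / len(grads)
    return grad ** 2


def irmv1_objective(envs, weights, lam):
    """Sum over environments of (risk + lam * penalty) for a linear predictor.

    envs is a list of (features, labels) pairs, one pair per training
    environment; features is a list of rows, weights a list of floats.
    """
    total = 0.0
    for features, labels in envs:
        logits = [sum(w * x for w, x in zip(weights, row)) for row in features]
        total += logistic_risk(logits, labels) + lam * irmv1_penalty(logits, labels)
    return total
```

Setting `lam = 0` in this sketch recovers the pooled per-environment ERM objective, which is the baseline the abstract compares against; larger values of `lam` push the solution toward predictors whose risk is stationary under rescaling in every training environment.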