Learning models that are robust to distribution shifts is a key concern for their real-world applicability. Invariant Risk Minimization (IRM) is a popular framework that aims to learn robust models from multiple environments. The success of IRM rests on an important assumption: the underlying causal mechanisms/features remain invariant across environments. When this assumption is not satisfied, we show that IRM can over-constrain the predictor; to remedy this, we propose a relaxation via $\textit{partial invariance}$. In this work, we theoretically highlight the sub-optimality of IRM and then demonstrate how learning from a partition of the training domains can improve invariant models. Several experiments, conducted both in linear settings and with deep neural networks on tasks over language and image data, verify our conclusions.
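For context, a minimal sketch of the bi-level IRM objective (as introduced by Arjovsky et al., 2019) that the abstract refers to; the notation here is assumed for illustration, not taken from this paper: $\Phi$ is the feature extractor, $w$ the classifier, $R^{e}$ the risk in environment $e$, and $\mathcal{E}_{\mathrm{tr}}$ the set of training environments:
\begin{equation*}
\min_{\Phi,\, w} \; \sum_{e \in \mathcal{E}_{\mathrm{tr}}} R^{e}(w \circ \Phi)
\quad \text{s.t.} \quad
w \in \operatorname*{arg\,min}_{\bar{w}} \; R^{e}(\bar{w} \circ \Phi) \;\; \forall e \in \mathcal{E}_{\mathrm{tr}}.
\end{equation*}
The constraint requires a single classifier $w$ to be simultaneously optimal in every training environment; this is the condition that becomes overly restrictive when the invariance assumption holds only partially.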