The goal of the Out-of-Distribution (OOD) generalization problem is to train a predictor that generalizes to all environments. Popular approaches in this field rest on the hypothesis that such a predictor should be an \textit{invariant predictor}, one that captures the mechanism that remains constant across environments. While these approaches have been experimentally successful in various case studies, there is still much room for the theoretical validation of this hypothesis. This paper presents a new set of theoretical conditions necessary for an invariant predictor to achieve OOD optimality. Our theory not only applies to non-linear cases, but also generalizes the necessary condition used in \citet{rojas2018invariant}. We also derive the Inter Gradient Alignment algorithm from our theory and demonstrate its competitiveness on MNIST-derived benchmark datasets as well as on two of the three \textit{Invariance Unit Tests} proposed by \citet{aubinlinear}.
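As a rough illustration only (the abstract does not spell out the algorithm), the sketch below shows one plausible form of an inter-environment gradient-alignment objective: the mean risk across environments plus a penalty on the variance of per-environment risk gradients, which pushes the gradients toward agreement. The function name \texttt{iga\_objective}, the weight \texttt{lam}, and the exact penalty form are assumptions for illustration, not the paper's definitive method.

\begin{verbatim}
import torch

def iga_objective(env_losses, params, lam=1.0):
    """Mean risk across environments plus a penalty on the variance of
    per-environment risk gradients (one reading of gradient alignment).
    NOTE: hypothetical sketch; the paper's actual objective may differ."""
    # One gradient tuple per environment; keep the graph so the
    # penalty itself can be differentiated through.
    grads = [torch.autograd.grad(loss, params, create_graph=True)
             for loss in env_losses]
    penalty = 0.0
    for per_param in zip(*grads):          # group gradients by parameter
        stacked = torch.stack(per_param)   # shape: (num_envs, *param_shape)
        penalty = penalty + ((stacked - stacked.mean(dim=0)) ** 2).sum()
    return torch.stack(list(env_losses)).mean() + lam * penalty

# Toy usage: two environments, one linear model shared across both.
w = torch.randn(3, requires_grad=True)
x1, y1 = torch.randn(8, 3), torch.randn(8)
x2, y2 = torch.randn(8, 3), torch.randn(8)
losses = [((x @ w - y) ** 2).mean() for x, y in [(x1, y1), (x2, y2)]]
obj = iga_objective(losses, [w], lam=1.0)
obj.backward()  # gradients now include the alignment penalty
\end{verbatim}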