The Invariant Risk Minimization (IRM) principle was first proposed by Arjovsky et al. [2019] to address the domain generalization problem by leveraging the heterogeneity of data collected under differing experimental conditions. Specifically, IRM seeks a data representation under which the optimal classifier remains invariant across all domains. Despite the conceptual appeal of IRM, the effectiveness of the originally proposed invariance penalty has recently been called into question. In particular, there exist counterexamples for which that invariance penalty can be made arbitrarily small by non-invariant data representations. We propose an alternative invariance penalty by revisiting the Gramian matrix of the data representation. We discuss the role of its eigenvalues in the relationship between the risk and the invariance penalty, and demonstrate that the Gramian matrix is ill-conditioned for these counterexamples. In the linear setting, the proposed approach is guaranteed to recover an invariant representation under mild non-degeneracy conditions. Its effectiveness is substantiated by experiments on DomainBed and InvarianceUnitTest, two extensive test beds for domain generalization.
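For concreteness, below is a minimal PyTorch sketch of the original IRMv1 penalty of Arjovsky et al. [2019] (the squared gradient norm of the per-domain risk with respect to a fixed scalar "dummy" classifier) together with a condition-number diagnostic on the Gramian of the representation, which flags the degenerate representations behind the counterexamples. The function names and the binary-classification setup are illustrative assumptions; this is not the exact form of the alternative penalty proposed in the paper.

```python
import torch
import torch.nn.functional as F

def irmv1_penalty(logits: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """IRMv1 penalty for one domain: squared norm of the gradient of the
    risk with respect to a fixed dummy classifier scale w = 1.0."""
    scale = torch.tensor(1.0, requires_grad=True)
    risk = F.binary_cross_entropy_with_logits(logits * scale, y)
    (grad,) = torch.autograd.grad(risk, [scale], create_graph=True)
    return grad.pow(2)

def gram_condition_number(phi: torch.Tensor) -> torch.Tensor:
    """Condition number of the empirical Gramian G = Phi^T Phi / n of the
    representation Phi (rows = samples). A large value indicates the
    ill-conditioned, near-degenerate representations for which the IRMv1
    penalty can be made arbitrarily small without being invariant."""
    gram = phi.t() @ phi / phi.shape[0]
    eigvals = torch.linalg.eigvalsh(gram)  # ascending order
    return eigvals[-1] / eigvals[0].clamp_min(1e-12)

# Illustrative usage on random data (shapes assumed for this sketch):
phi = torch.randn(128, 16)                  # representation, n x d
logits = phi @ torch.randn(16)              # linear classifier on top
y = torch.randint(0, 2, (128,)).float()     # binary labels
print(irmv1_penalty(logits, y).item(), gram_condition_number(phi).item())
```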