Domain generalization asks for models trained on a set of training environments to perform well on unseen test environments. Recently, a series of algorithms, such as Invariant Risk Minimization (IRM), has been proposed for domain generalization. However, Rosenfeld et al. (2021) show that in a simple linear data model, even if non-convexity issues are ignored, IRM and its extensions cannot generalize to unseen environments with fewer than $d_s+1$ training environments, where $d_s$ is the dimension of the spurious-feature subspace. In this paper, we propose to achieve domain generalization with Invariant-feature Subspace Recovery (ISR). Our first algorithm, ISR-Mean, can identify the subspace spanned by invariant features from the first-order moments of the class-conditional distributions, and achieves provable domain generalization with $d_s+1$ training environments under the data model of Rosenfeld et al. (2021). Our second algorithm, ISR-Cov, further reduces the required number of training environments to $O(1)$ using the information of second-order moments. Notably, unlike IRM, our algorithms bypass non-convexity issues and enjoy global convergence guarantees. Empirically, our ISRs obtain superior performance compared with IRM on synthetic benchmarks. In addition, on three real-world image and text datasets, we show that ISR-Mean can be used as a simple yet effective post-processing method to increase the worst-case accuracy of trained models against spurious correlations and group shifts.
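To illustrate the first-order-moment idea behind ISR-Mean, the following is a minimal sketch, not the paper's exact algorithm: since invariant features share the same class-conditional mean across environments, the cross-environment variation of the per-environment class-conditional means lies in the spurious subspace, and its orthogonal complement estimates the invariant-feature subspace. The function name, arguments, and the use of a plain SVD here are illustrative assumptions.

```python
import numpy as np


def isr_mean_projection(X_envs, y_envs, target_class=1, n_spurious=1):
    """Illustrative ISR-Mean-style subspace recovery (a sketch, not the
    paper's exact procedure).

    X_envs: list of (n_e, d) feature arrays, one per training environment.
    y_envs: list of (n_e,) label arrays.
    Returns a (d, d - n_spurious) basis of the estimated invariant subspace.
    """
    # First-order moments: class-conditional mean in each environment.
    means = np.stack([X[y == target_class].mean(axis=0)
                      for X, y in zip(X_envs, y_envs)])  # shape (E, d)
    # Variation of these means across environments lies (up to noise)
    # in the spurious-feature subspace.
    centered = means - means.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=True)
    # Top n_spurious right-singular vectors estimate spurious directions;
    # keep the orthogonal complement as the invariant subspace.
    return Vt[n_spurious:].T  # project data via X @ basis
```

On synthetic data where one coordinate has a fixed class-conditional mean and the remaining coordinates' means shift per environment, the returned basis aligns with the invariant coordinate, and a classifier can then be trained on the projected features.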