We study the problem of out-of-distribution (o.o.d.) generalization, where spurious correlations between attributes vary across the training and test domains. This is known as the problem of correlation shift and has raised concerns about the reliability of machine learning. In this work, we introduce the concepts of direct and indirect effects from causal inference into the domain generalization problem. We argue that models that learn direct effects minimize the worst-case risk across correlation-shifted domains. To eliminate the indirect effects, our algorithm consists of two stages: in the first stage, we learn an indirect-effect representation by minimizing the error in predicting the domain label from the representation together with the class label; in the second stage, we remove the indirect effects learned in the first stage by matching each sample with another sample that has a similar indirect-effect representation but a different class label. We also propose a new model selection method that matches the validation set in the same way, which is shown to improve the generalization performance of existing models on correlation-shifted datasets. Experiments on five correlation-shifted datasets and the DomainBed benchmark verify the effectiveness of our approach.
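To make the two-stage procedure concrete, below is a minimal PyTorch-style sketch based only on the description above; it is an illustration under our own assumptions, not the authors' implementation. All names (`IndirectEffectNet`, `match_by_indirect_effect`, the architecture, and the distance-based matching rule) are hypothetical.

```python
# Hypothetical sketch of the two-stage procedure from the abstract.
# Stage 1: learn an indirect-effect representation by predicting the
#          domain label from (representation, class label).
# Stage 2: match each sample to another with a similar indirect-effect
#          representation but a different class label.
import torch
import torch.nn as nn
import torch.nn.functional as F


class IndirectEffectNet(nn.Module):
    """Stage 1 model: the representation, concatenated with a one-hot
    class label, is used to predict the domain label."""

    def __init__(self, feat_dim, num_classes, num_domains, rep_dim=64):
        super().__init__()
        self.encoder = nn.Linear(feat_dim, rep_dim)  # assumed encoder
        self.domain_head = nn.Linear(rep_dim + num_classes, num_domains)
        self.num_classes = num_classes

    def forward(self, x, y):
        z = self.encoder(x)
        y_onehot = F.one_hot(y, self.num_classes).float()
        domain_logits = self.domain_head(torch.cat([z, y_onehot], dim=1))
        return domain_logits, z


def stage1_loss(model, x, y, d):
    """Minimize the domain-label prediction error, as in the first stage."""
    domain_logits, _ = model(x, y)
    return F.cross_entropy(domain_logits, d)


def match_by_indirect_effect(z, y):
    """Stage 2 matching: for each sample, return the index of the nearest
    sample in indirect-effect-representation space with a DIFFERENT label."""
    dist = torch.cdist(z, z)                        # pairwise distances
    same_class = y.unsqueeze(0) == y.unsqueeze(1)   # mask same-label pairs
    dist = dist.masked_fill(same_class, float("inf"))
    return dist.argmin(dim=1)
```

The same `match_by_indirect_effect` routine could, in principle, be applied to a held-out validation set to realize the matched model selection the abstract describes, though the exact matching criterion used by the authors may differ.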