Visual representations underlie object recognition tasks, but they often contain both robust and non-robust features. Our main observation is that image classifiers may perform poorly on out-of-distribution samples because spurious correlations between non-robust features and labels may change in a new environment. By analyzing procedures for out-of-distribution generalization through a causal graph, we show that standard classifiers fail because the association between images and labels is not transportable across settings. However, we then show that the causal effect, which severs all sources of confounding, remains invariant across domains. This motivates us to develop an algorithm to estimate the causal effect for image classification, which is transportable (i.e., invariant) across source and target environments. Without observing additional variables, we show that we can derive an estimand for the causal effect under empirical assumptions, using representations in deep models as proxies. Theoretical analysis, empirical results, and visualizations show that our approach captures causal invariances and improves overall generalization.
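To make the claimed estimand concrete, one standard route to an identification formula of this kind is a front-door-style adjustment; the sketch below is illustrative and rests on our own assumptions rather than the paper's exact construction. Suppose X is the image, Y the label, U an unobserved confounder that varies across environments, and R a learned representation assumed to mediate the effect of X on Y while being shielded from U. Under the usual front-door conditions, the causal effect can be written as

\[
P(y \mid \mathrm{do}(x)) \;=\; \sum_{r} P(r \mid x) \sum_{x'} P(y \mid x', r)\, P(x').
\]

Every term on the right-hand side is an observational quantity estimable in the source domain, and because the confounder U no longer appears in the expression, the resulting quantity stays invariant when the spurious association between X and Y shifts in the target domain.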