Informally, a 'spurious correlation' is a model's dependence on some aspect of the input data that an analyst thinks shouldn't matter. In machine learning, these have a know-it-when-you-see-it character; e.g., changing the gender of a sentence's subject changes a sentiment predictor's output. To check for spurious correlations, we can 'stress test' models by perturbing irrelevant parts of the input data and seeing whether model predictions change. In this paper, we study stress testing using the tools of causal inference. We introduce counterfactual invariance as a formalization of the requirement that changing irrelevant parts of the input shouldn't change model predictions. We connect counterfactual invariance to out-of-domain model performance, and provide practical schemes for learning (approximately) counterfactually invariant predictors, without access to counterfactual examples. It turns out that both the means and implications of counterfactual invariance depend fundamentally on the true underlying causal structure of the data; in particular, on whether the label causes the features or the features cause the label. Distinct causal structures require distinct regularization schemes to induce counterfactual invariance. Similarly, counterfactual invariance implies different domain-shift guarantees depending on the underlying causal structure. The theory is supported by empirical results on text classification.
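The 'stress test' described above can be made concrete with a small sketch. Nothing in it comes from the paper: the `predict` function, the word-swap table, and the tolerance are illustrative assumptions for a sentiment-style model that maps a sentence to a score.

```python
# A minimal stress-test sketch: perturb an (assumed) label-irrelevant
# attribute -- here, the gender of the sentence's subject -- and flag
# examples where the model's prediction moves. All names are hypothetical.

GENDER_SWAPS = {"he": "she", "she": "he", "him": "her", "her": "him",
                "his": "hers", "hers": "his", "actor": "actress",
                "actress": "actor"}

def swap_gender(sentence: str) -> str:
    """Naive token-level perturbation of the subject's gender."""
    tokens = sentence.split()
    return " ".join(GENDER_SWAPS.get(t.lower(), t) for t in tokens)

def stress_test(predict, sentences, tol=1e-6):
    """Return examples where the perturbation changes the prediction.

    `predict` is any callable mapping a sentence to a scalar score.
    """
    failures = []
    for s in sentences:
        original, perturbed = predict(s), predict(swap_gender(s))
        if abs(original - perturbed) > tol:
            failures.append((s, original, perturbed))
    return failures
```

A counterfactually invariant predictor would return an empty failure list for any perturbation that touches only the irrelevant attribute.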
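The claim that distinct causal structures call for distinct regularization schemes can likewise be sketched. One common way to penalize dependence of a learned representation on the perturbed attribute is a maximum mean discrepancy (MMD) term; reading the marginal penalty as one causal structure and the label-conditional penalty as the other is our interpretation of the abstract, and every name in the sketch is an assumption, not the paper's API.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth=1.0):
    """RBF kernel matrix between the rows of a and b."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * bandwidth ** 2))

def mmd2(x, y, bandwidth=1.0):
    """Biased estimate of squared MMD between samples x and y."""
    return (rbf_kernel(x, x, bandwidth).mean()
            + rbf_kernel(y, y, bandwidth).mean()
            - 2.0 * rbf_kernel(x, y, bandwidth).mean())

def invariance_penalty(reps, z, y=None, bandwidth=1.0):
    """Penalize dependence of representations `reps` on a binary attribute z.

    With y=None, match the representation distributions across z marginally.
    With labels y, match them only within each label stratum (assumes every
    stratum contains samples from both z groups).
    """
    if y is None:
        return mmd2(reps[z == 0], reps[z == 1], bandwidth)
    strata = [mmd2(reps[(z == 0) & (y == label)],
                   reps[(z == 1) & (y == label)], bandwidth)
              for label in np.unique(y)]
    return float(np.mean(strata))
```

In training, such a penalty would be added to the task loss with a tradeoff weight, e.g. `loss = task_loss + lam * invariance_penalty(reps, z, y)`; which variant (marginal or conditional) is appropriate is exactly what the underlying causal structure determines.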