Spurious correlations are a threat to the trustworthiness of natural language processing systems, motivating research into methods for identifying and eliminating them. However, addressing the problem of spurious correlations requires more clarity on what they are and how they arise in language data. Gardner et al. (2021) argue that the compositional nature of language implies that \emph{all} correlations between labels and individual "input features" are spurious. This paper analyzes this proposal in the context of a toy example, demonstrating three distinct conditions that can give rise to feature-label correlations in a simple PCFG. Linking the toy example to a structured causal model shows that (1) feature-label correlations can arise even when the label is invariant to interventions on the feature, and (2) feature-label correlations may be absent even when the label is sensitive to interventions on the feature. Because input features will be individually correlated with labels in all but very rare circumstances, domain knowledge must be applied to identify spurious correlations that pose genuine robustness threats.
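Claims (1) and (2) can be illustrated with two small binary models. The sketch below is not the paper's PCFG or its causal model; the two scenarios (a hidden common cause, and an XOR of two independent features), the noise level, and all function names are assumptions chosen only for illustration. Each scenario is enumerated exactly, so no sampling is involved.

```python
from itertools import product


def p_label_given_feature(joint, x):
    """P(Y=1 | X=x) from a dict mapping (x, y) -> probability."""
    positive = joint.get((x, 1), 0.0)
    total = positive + joint.get((x, 0), 0.0)
    return positive / total


def flip_sensitivity(label_fn, contexts):
    """Probability mass of contexts where flipping the feature changes the label."""
    return sum(p for ctx, p in contexts if label_fn(0, ctx) != label_fn(1, ctx))


def confounded_joint(noise=0.1):
    """Hypothetical scenario A: hidden Z drives both X and Y; X never affects Y."""
    joint = {}
    for z, flip in product([0, 1], [0, 1]):
        p = 0.5 * (noise if flip else 1.0 - noise)
        x = z ^ flip  # X is a noisy copy of Z
        y = z         # Y depends on Z only
        joint[(x, y)] = joint.get((x, y), 0.0) + p
    return joint


def xor_joint():
    """Hypothetical scenario B: Y = X1 XOR X2 with independent uniform features."""
    joint = {}
    for x1, x2 in product([0, 1], [0, 1]):
        y = x1 ^ x2
        joint[(x1, y)] = joint.get((x1, y), 0.0) + 0.25
    return joint


uniform_context = [(0, 0.5), (1, 0.5)]

# Scenario A: observational gap is large, interventional sensitivity is zero.
joint_a = confounded_joint()
gap_a = p_label_given_feature(joint_a, 1) - p_label_given_feature(joint_a, 0)
sens_a = flip_sensitivity(lambda x, z: z, uniform_context)

# Scenario B: observational gap is zero, interventional sensitivity is total.
joint_b = xor_joint()
gap_b = p_label_given_feature(joint_b, 1) - p_label_given_feature(joint_b, 0)
sens_b = flip_sensitivity(lambda x1, x2: x1 ^ x2, uniform_context)

print(f"A: gap={gap_a:.2f}, sensitivity={sens_a:.2f}")
print(f"B: gap={gap_b:.2f}, sensitivity={sens_b:.2f}")
```

In scenario A the feature is informative about the label although setting it by hand never changes the label, matching claim (1). In scenario B every intervention on the feature flips the label, yet the feature alone carries no information about it, matching claim (2).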