Large amounts of training data are one of the major reasons for the high performance of state-of-the-art NLP models. But what exactly in the training data causes a model to make a certain prediction? We seek to answer this question by providing a language for describing how training data influences predictions, through a causal framework. Importantly, our framework bypasses the need to retrain expensive models and allows us to estimate causal effects based on observational data alone. Addressing the problem of extracting factual knowledge from pretrained language models (PLMs), we focus on simple data statistics such as co-occurrence counts and show that these statistics do influence the predictions of PLMs, suggesting that such models rely on shallow heuristics. Our causal framework and our results demonstrate the importance of studying datasets and the benefits of causality for understanding NLP models.
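To make the "simple data statistics" concrete, here is a minimal sketch of the kind of co-occurrence count the abstract refers to: how often a subject and object term appear in the same sentence of a training corpus. The function name, toy corpus, and term pairs are hypothetical illustrations, not the paper's actual pipeline.

```python
from collections import Counter

def cooccurrence_counts(corpus, pairs):
    """Count sentences in which both terms of a (subject, object) pair occur.

    Hypothetical helper illustrating the co-occurrence statistic;
    matching is naive whitespace tokenization, case-insensitive.
    """
    counts = Counter()
    for sentence in corpus:
        tokens = set(sentence.lower().split())
        for subj, obj in pairs:
            if subj.lower() in tokens and obj.lower() in tokens:
                counts[(subj, obj)] += 1
    return counts

# Toy corpus standing in for pretraining data.
corpus = [
    "Paris is the capital of France",
    "France borders Spain",
    "The Eiffel Tower is in Paris",
]
pairs = [("Paris", "France"), ("Paris", "Spain")]
counts = cooccurrence_counts(corpus, pairs)
print(counts[("Paris", "France")], counts[("Paris", "Spain")])  # → 1 0
```

A statistic like this, computed per fact over the pretraining corpus, is the kind of observational quantity whose causal effect on a PLM's factual predictions the framework is designed to estimate without retraining.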