What is the relationship between linguistic dependencies and statistical dependence? Building on earlier work in NLP and cognitive science, we study this question. We introduce a contextualized version of pointwise mutual information (CPMI), using pretrained language models to estimate probabilities of words in context. Extracting dependency trees that maximize CPMI, we compare the resulting structures against gold dependencies. Overall, we find that these maximum-CPMI trees correspond to linguistic dependencies more often than trees extracted from a non-contextual PMI estimate, but only roughly as often as a simple baseline formed by connecting adjacent words. We also provide evidence that the extent to which the two kinds of dependency align cannot be explained by the distance between words or by the category of the dependency relation. Finally, our analysis sheds some light on the differences between large pretrained language models, specifically in the kinds of inductive biases they encode.
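To make the procedure concrete, here is a minimal sketch of CPMI estimation and maximum-CPMI tree extraction. It assumes a masked language model (bert-base-cased via HuggingFace transformers) as the probability estimator and takes CPMI(w_i; w_j) to be the difference in the log-probability of w_i when w_j is visible versus masked; the one-wordpiece-per-word treatment, averaging-based symmetrization, and model choice are illustrative assumptions, not the paper's exact setup.

```python
# Sketch: CPMI matrix from a masked LM, then a maximum spanning tree.
# Assumptions (labeled, not from the abstract): BERT as the estimator,
# CPMI(i; j) = log p(w_i | s with i masked) - log p(w_i | s with i, j masked),
# one wordpiece per word, scores symmetrized by averaging.
import numpy as np
import torch
from scipy.sparse.csgraph import minimum_spanning_tree
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-cased").eval()


def masked_log_prob(input_ids, target, masked_positions):
    """Log-probability of the original token at `target` after masking
    every position in `masked_positions` (which includes `target`)."""
    ids = input_ids.clone()
    ids[0, list(masked_positions)] = tokenizer.mask_token_id
    with torch.no_grad():
        logits = model(ids).logits
    log_probs = torch.log_softmax(logits[0, target], dim=-1)
    return log_probs[input_ids[0, target]].item()


def cpmi_matrix(sentence):
    """CPMI score for every ordered pair of token positions,
    excluding the [CLS]/[SEP] special tokens."""
    input_ids = tokenizer(sentence, return_tensors="pt").input_ids
    n = input_ids.shape[1]
    scores = np.zeros((n, n))
    for i in range(1, n - 1):
        base = masked_log_prob(input_ids, i, [i])  # w_j visible
        for j in range(1, n - 1):
            if i != j:  # w_j masked as well
                scores[i, j] = base - masked_log_prob(input_ids, i, [i, j])
    return scores[1:-1, 1:-1]  # drop special-token rows/columns


def max_cpmi_tree(sentence):
    """Edges of the undirected spanning tree maximizing total CPMI."""
    sym = cpmi_matrix(sentence)
    sym = (sym + sym.T) / 2
    # SciPy only ships a *minimum* spanning tree, so flip the sign and
    # shift so all edge weights are strictly positive (SciPy treats
    # zeros in a dense matrix as missing edges).
    weights = sym.max() - sym + 1.0
    np.fill_diagonal(weights, 0.0)
    mst = minimum_spanning_tree(weights)
    return [(int(i), int(j)) for i, j in zip(*mst.nonzero())]


print(max_cpmi_tree("The kids ran to the park"))
```

The sign flip plus constant shift preserves the optimal tree, since every spanning tree over n nodes has exactly n − 1 edges; the resulting edge list can then be scored against gold dependencies, or against the adjacent-word baseline mentioned above.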