Centering theory (CT; Grosz et al., 1995) provides a linguistic analysis of the structure of discourse. According to the theory, the local coherence of a discourse arises from the manner and extent to which successive utterances refer to the same entities. In this paper, we investigate the connection between centering theory and modern coreference resolution systems. We provide an operationalization of centering and systematically investigate whether neural coreference resolvers adhere to the rules of centering theory by defining various discourse metrics and developing a search-based methodology. Our information-theoretic analysis reveals a positive dependence between coreference and centering, but also shows that high-quality neural coreference resolvers may not benefit much from explicitly modeling centering ideas. Our analysis further shows that contextualized embeddings contain much of the coherence information, which helps explain why CT provides only small gains to modern neural coreference resolvers that use pretrained representations. Finally, we discuss factors that contribute to coreference but are not modeled by CT, such as world knowledge and recency bias. We formulate a version of CT that also models recency and show that it captures coreference information better than vanilla CT.
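To make the notion of "successive utterances referring to the same entities" concrete, the following is a minimal sketch of the three classic centering transitions (Continue, Retain, Shift) in the spirit of Grosz et al. (1995). It is an illustration only, not the operationalization used in this paper. The function names, the `NO_CB` label, the representation of each utterance as a salience-ranked list of entity names, and the convention that an undefined previous backward-looking center is compatible with Continue and Retain are all assumptions made for this example.

```python
from typing import List, Optional


def backward_center(prev_cf: List[str], cur_cf: List[str]) -> Optional[str]:
    """Return the highest-ranked entity of the previous utterance that is
    also realized in the current utterance, or None if there is none."""
    realized = set(cur_cf)
    for entity in prev_cf:  # prev_cf is ordered from most to least salient
        if entity in realized:
            return entity
    return None


def transition(prev_cb: Optional[str], cur_cb: Optional[str], cur_cf: List[str]) -> str:
    """Classify the transition into the current utterance (hypothetical labels)."""
    if cur_cb is None:
        return "NO_CB"  # no entity is shared with the previous utterance
    if prev_cb is not None and cur_cb != prev_cb:
        return "SHIFT"  # the backward-looking center changed
    # Same center (or no previous center, by assumption): check whether it is
    # also the most salient entity of the current utterance.
    return "CONTINUE" if cur_cf[0] == cur_cb else "RETAIN"


def label_discourse(cf_lists: List[List[str]]) -> List[str]:
    """Label every adjacent utterance pair in a discourse."""
    labels = []
    prev_cb = None
    for prev_cf, cur_cf in zip(cf_lists, cf_lists[1:]):
        cur_cb = backward_center(prev_cf, cur_cf)
        labels.append(transition(prev_cb, cur_cb, cur_cf))
        prev_cb = cur_cb
    return labels


# Toy discourse: each inner list is a salience-ranked list of entities.
toy = [
    ["John", "store", "piano"],
    ["John", "store"],
    ["store", "John"],
    ["store", "piano"],
]
print(label_discourse(toy))
```

In this toy discourse, the entity lists are written by hand; in practice they would come from a coreference resolver and a salience ranking such as grammatical role, both of which are outside the scope of this sketch.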