World knowledge has been particularly crucial for predicting the discourse connective that marks the discourse relation between two arguments, a task at which language models (LMs) are generally successful. In this work, we flip that premise and instead study the inverse problem: whether discourse connectives can inform LMs about the world. To this end, we present WUGNECTIVES, a dataset of 8,880 stimuli that evaluates LMs' inferences about novel entities in contexts where connectives link those entities to particular attributes. Investigating 17 LMs across various scales and training regimens, we find that tuning an LM to exhibit reasoning behavior yields noteworthy improvements on most connectives. At the same time, overall performance varies widely across connective types, with all models systematically struggling on connectives that express a concessive meaning. Our findings pave the way for more nuanced investigations into the functional role of language cues as captured by LMs. We release WUGNECTIVES at https://github.com/sheffwb/wugnectives.