As pre-trained language models (LMs) continue to dominate NLP, it is increasingly important that we understand the depth of language capabilities in these models. In this paper, we target pre-trained LMs' competence in pragmatics, with a focus on pragmatics relating to discourse connectives. We formulate cloze-style tests using a combination of naturally-occurring data and controlled inputs drawn from psycholinguistics. We focus on testing models' ability to use pragmatic cues to predict discourse connectives, models' ability to understand implicatures relating to connectives, and the extent to which models show humanlike preferences regarding temporal dynamics of connectives. We find that although models predict connectives reasonably well in the context of naturally-occurring data, when we control contexts to isolate high-level pragmatic cues, model sensitivity is much lower. Models also do not show substantial humanlike temporal preferences. Overall, the findings suggest that at present, dominant pre-training paradigms do not result in substantial pragmatic competence in our models.