To what extent do pre-trained language models grasp semantic knowledge of distributivity? In this paper, we introduce DistNLI, a new diagnostic dataset for natural language inference that targets the semantic difference arising from distributivity, and employ the causal mediation analysis framework to quantify model behavior and explore the underlying mechanism in this semantics-related task. We find that the extent of models' understanding is associated with model size and vocabulary size. We also provide insights into how models encode such high-level semantic knowledge.