Humans can seamlessly reason with circumstantial preconditions of commonsense knowledge. We understand that a glass is used for drinking water, unless the glass is broken or the water is toxic. Despite the impressive performance of state-of-the-art (SOTA) language models (LMs) on inferring commonsense knowledge, it is unclear whether they understand such circumstantial preconditions. To address this gap, we propose a novel challenge of reasoning with circumstantial preconditions. We collect a dataset, called PaCo, consisting of 12.4 thousand preconditions of commonsense statements expressed in natural language. Based on this dataset, we create three canonical evaluation tasks and use them to examine the capability of existing LMs to understand situational preconditions. Our results reveal a 10-30% gap between machine and human performance on our tasks, which shows that reasoning with preconditions is an open challenge.