Commonsense inference to understand and explain human language is a fundamental research problem in natural language processing. Explaining human conversations poses a great challenge as it requires contextual understanding, planning, inference, and several aspects of reasoning including causal, temporal, and commonsense reasoning. In this work, we introduce CIDER -- a manually curated dataset that contains dyadic dialogue explanations in the form of implicit and explicit knowledge triplets inferred using contextual commonsense inference. Extracting such rich explanations from conversations can be conducive to improving several downstream applications. The annotated triplets are categorized by the type of commonsense knowledge present (e.g., causal, conditional, temporal). We set up three different tasks conditioned on the annotated dataset: Dialogue-level Natural Language Inference, Span Extraction, and Multi-choice Span Selection. Baseline results obtained with transformer-based models reveal that the tasks are difficult, paving the way for promising future research. The dataset and the baseline implementations are publicly available at https://cider-task.github.io/cider/.