Contextual commonsense inference is the task of generating various types of explanations for the events in a dyadic dialogue, including their causes, motivations, and emotional reactions. Producing a coherent and non-trivial explanation requires awareness of the dialogue's structure and of how an event is grounded in its context. In this work, we create CICEROv2, a dataset of 8,351 instances from 2,379 dialogues, containing multiple human-written answers for each contextual commonsense inference question; each question targets one type of explanation: cause, subsequent event, motivation, or emotional reaction. We show that the inferences in CICEROv2 are more semantically diverse than those in other contextual commonsense inference datasets. To solve the inference task, we propose a collection of pre-training objectives, including concept denoising and utterance sorting, that prepare a pre-trained model for the downstream contextual commonsense inference task. Our results show that the proposed pre-training objectives are effective at adapting a pre-trained T5-Large model to the contextual commonsense inference task.