Efficiently discovering the emotion states of speakers in a multi-party conversation is important for designing human-like conversational agents. During a conversation, the cognitive state of a speaker often changes because of certain past utterances, which may lead to a flip in her emotion state. Therefore, discovering the reasons (triggers) behind a speaker's emotion flip during a conversation is important for explaining the emotion labels of individual utterances. In this paper, along with addressing the task of emotion recognition in conversations (ERC), we introduce a novel task, Emotion Flip Reasoning (EFR), which aims to identify the past utterances that triggered a speaker's emotion state to flip at a certain time. We propose a masked memory network for ERC and a Transformer-based network for EFR. To this end, we consider MELD, a benchmark dataset for emotion recognition in multi-party conversations, for the task of ERC and augment it with new ground-truth labels for EFR. An extensive comparison with four state-of-the-art models suggests improved performance of our models on both tasks. We further present anecdotal evidence and both qualitative and quantitative error analyses to support the superiority of our models over the baselines.