In multi-party dialogue, several conversation threads often unfold simultaneously within the same dialogue history. This entanglement makes the history harder to understand for both humans and machines and poses a challenge for dialogue reading comprehension. Dialogue disentanglement aims to separate the conversation threads in a multi-party dialogue history, thereby reducing the difficulty of comprehending a long, disordered dialogue passage. Existing studies commonly focus on utterance encoding with carefully designed feature-engineering-based methods but pay inadequate attention to dialogue structure. This work designs a novel model that disentangles a multi-party dialogue history into threads by taking dialogue structure features into account. Specifically, because dialogues are constructed through the successive participation of speakers and through interactions between users of interest, we extract clues from speaker properties and user references to model the structure of a long dialogue record. The method is evaluated on the Ubuntu IRC dataset and achieves state-of-the-art results in dialogue disentanglement.
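To make the two structural clues concrete, the following is a minimal sketch of how speaker identity and user references might be read off an IRC-style log. It is an illustration only, not the model proposed in this work: the line format, the `Utterance` class, and the functions `parse_log` and `structure_features` are all hypothetical names introduced here, and the mention detection is a simple token match against known speakers.

```python
import re
from dataclasses import dataclass, field

# Assumed (hypothetical) line format: "[10:02] <bob> alice: sudo apt install vim"
LINE_RE = re.compile(r"^\[(\d{2}:\d{2})\]\s+<([^>]+)>\s+(.*)$")


@dataclass
class Utterance:
    index: int
    speaker: str
    text: str
    mentions: set = field(default_factory=set)


def parse_log(lines):
    """Parse raw lines into utterances and mark which known speakers each one mentions."""
    utterances = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip system messages and malformed lines
        utterances.append(Utterance(len(utterances), m.group(2), m.group(3)))

    speakers = {u.speaker for u in utterances}
    for u in utterances:
        # Simple token match: a token counts as a reference if it is a known speaker name.
        tokens = set(re.findall(r"[\w\-\[\]]+", u.text))
        u.mentions = (tokens & speakers) - {u.speaker}
    return utterances


def structure_features(utterances):
    """For each pair (earlier j, later i), compute two binary structure clues."""
    feats = {}
    for i, cur in enumerate(utterances):
        for j in range(i):
            prev = utterances[j]
            feats[(j, i)] = {
                # Speaker clue: both utterances come from the same user.
                "same_speaker": cur.speaker == prev.speaker,
                # Reference clue: one utterance names the speaker of the other.
                "refers_to": prev.speaker in cur.mentions
                or cur.speaker in prev.mentions,
            }
    return feats


sample_log = [
    "[10:00] <alice> how do I install vim?",
    "[10:01] <carol> my wifi keeps dropping",
    "[10:02] <bob> alice: sudo apt install vim",
    "[10:03] <alice> thanks bob",
    "=== dave has joined",
]

sample_utterances = parse_log(sample_log)
sample_features = structure_features(sample_utterances)
for pair, clues in sorted(sample_features.items()):
    print(pair, clues)
```

In this toy log, the pairs that share a speaker or contain a user reference are the ones belonging to the vim thread, while carol's wifi message has neither clue linking it to the others. A real system would feed such signals into a learned model alongside utterance encodings rather than use them as hard rules.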