Recently, various neural models for multi-party conversation (MPC) have achieved impressive improvements on a variety of tasks such as addressee recognition, speaker identification and response prediction. However, existing methods for MPC usually represent interlocutors and utterances individually and ignore the inherently complicated structure of MPC, which may provide crucial interlocutor and utterance semantics and enhance the conversation understanding process. To this end, we present MPC-BERT, a pre-trained model for MPC understanding that learns who says what to whom in a unified model through several elaborately designed self-supervised tasks. Specifically, these tasks can be categorized into (1) interlocutor structure modeling, including reply-to utterance recognition, identical speaker searching and pointer consistency distinction, and (2) utterance semantics modeling, including masked shared utterance restoration and shared node detection. We evaluate MPC-BERT on three downstream tasks: addressee recognition, speaker identification and response selection. Experimental results show that MPC-BERT outperforms previous methods by large margins and achieves new state-of-the-art performance on all three downstream tasks on two benchmarks.
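To make the flavor of these self-supervised objectives concrete, below is a minimal, illustrative sketch (not the authors' implementation) of one of them, reply-to utterance recognition: given per-utterance vectors, the model predicts which earlier utterance each utterance replies to. The encoder is omitted here, so random vectors stand in for BERT-derived utterance representations, and the class name `ReplyToHead`, the bilinear scoring form, and the toy reply structure are all hypothetical choices for illustration.

```python
# Illustrative sketch of a reply-to utterance recognition objective.
# NOT the MPC-BERT reference code; utterance vectors would normally come
# from a BERT-style encoder, replaced here by random tensors.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReplyToHead(nn.Module):
    """Scores every (query utterance, candidate addressee utterance) pair."""
    def __init__(self, hidden: int = 256):
        super().__init__()
        self.bilinear = nn.Parameter(torch.empty(hidden, hidden))
        nn.init.xavier_uniform_(self.bilinear)

    def forward(self, utt_vecs: torch.Tensor) -> torch.Tensor:
        # utt_vecs: (num_utterances, hidden), one vector per utterance.
        # Returns a (num_utterances, num_utterances) matrix of reply scores.
        return utt_vecs @ self.bilinear @ utt_vecs.T

def reply_to_loss(scores: torch.Tensor, reply_targets: torch.Tensor) -> torch.Tensor:
    # An utterance can only reply to a strictly earlier one, so mask out
    # the diagonal and upper triangle before the softmax cross-entropy.
    n = scores.size(0)
    causal = torch.tril(torch.ones(n, n, dtype=torch.bool), diagonal=-1)
    scores = scores.masked_fill(~causal, float("-inf"))
    # reply_targets[i] = index of the utterance that utterance i replies to;
    # skip utterance 0, which has no predecessor.
    return F.cross_entropy(scores[1:], reply_targets[1:])

# Toy usage with random stand-in "utterance embeddings".
torch.manual_seed(0)
utt_vecs = torch.randn(5, 256)            # 5 utterances in one conversation
targets  = torch.tensor([0, 0, 1, 1, 3])  # hypothetical reply-to structure
head = ReplyToHead(256)
loss = reply_to_loss(head(utt_vecs), targets)
loss.backward()
print(f"reply-to recognition loss: {loss.item():.4f}")
```

The other interlocutor structure tasks (identical speaker searching, pointer consistency distinction) can be framed analogously as classification or matching over pairs of utterance representations, with supervision derived for free from the conversation's speaker and reply annotations.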