The Conditional Variational AutoEncoder (CVAE) effectively increases the diversity and informativeness of responses in open-ended dialogue generation by enriching the context vector with sampled latent variables. However, because human dialogue is inherently one-to-many and many-to-one, the sampled latent variables may not correctly reflect the semantics of the context, which leads to irrelevant and incoherent responses. To resolve this problem, we propose the Self-separated Conditional Variational AutoEncoder (SepaCVAE), which introduces group information to regularize the latent variables. This enhances CVAE by improving the relevance and coherence of responses while maintaining their diversity and informativeness. SepaCVAE actively divides the input data into groups, then widens the absolute difference between data pairs from distinct groups while narrowing the relative distance between data pairs in the same group. Empirical results from automatic evaluation and detailed analysis demonstrate that SepaCVAE significantly improves generated responses on well-established open-domain dialogue datasets.
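The grouping idea can be illustrated with a small sketch. The abstract does not give the actual SepaCVAE objective, so the code below substitutes a generic contrastive-style penalty as a stand-in: it pulls latent vectors from the same group together and pushes vectors from different groups apart up to a margin. The function name `group_separation_penalty`, the `margin` parameter, and the assumption that group ids are already supplied are all hypothetical choices made for illustration.

```python
import numpy as np

def group_separation_penalty(z, groups, margin=1.0):
    """Illustrative contrastive-style penalty; NOT the paper's loss.

    z      : array of shape (n, d), one latent vector per sample
    groups : array of shape (n,), an integer group id per sample (assumed given)
    margin : hypothetical distance below which cross-group pairs are penalized
    """
    z = np.asarray(z, dtype=float)
    groups = np.asarray(groups)

    # Pairwise Euclidean distances between all latent vectors.
    diff = z[:, None, :] - z[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Boolean masks over unordered pairs (i < j), so self-pairs are excluded.
    same = groups[:, None] == groups[None, :]
    upper = np.triu(np.ones_like(same, dtype=bool), k=1)
    same_pairs = same & upper
    cross_pairs = ~same & upper

    # Same-group pairs: penalize their distance (narrow the gap).
    pull = dist[same_pairs].mean() if same_pairs.any() else 0.0
    # Cross-group pairs: penalize only when closer than the margin (widen the gap).
    push = np.maximum(0.0, margin - dist[cross_pairs]).mean() if cross_pairs.any() else 0.0
    return float(pull + push)

# Four latent vectors forming two tight clusters.
latents = np.array([[0.0, 0.0], [0.0, 0.0], [5.0, 5.0], [5.0, 5.0]])
well_grouped = group_separation_penalty(latents, [0, 0, 1, 1])
badly_grouped = group_separation_penalty(latents, [0, 1, 0, 1])
```

When the group ids match the clusters, the penalty vanishes; when they cut across the clusters, both terms contribute. In a CVAE training loop, a term of this kind would be added to the usual reconstruction and KL terms, with a weighting that is likewise an assumption here.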