We study open-domain response generation with limited message-response pairs. The problem arises in real-world applications but has received little attention in existing work. Since the paired data alone are no longer sufficient to train a neural generation model, we leverage large-scale unpaired data, which are much easier to obtain, and propose response generation with both paired and unpaired data. The generation model is an encoder-decoder architecture with templates as a prior, where the templates are estimated from the unpaired data with a neural hidden semi-Markov model. In this way, response generation learned from the small paired data is aided by the semantic and syntactic knowledge in the large unpaired data. To balance the effects of the prior and the input message on response generation, we propose learning the whole generation model with an adversarial approach. Empirical studies on question response generation and sentiment response generation indicate that when only a few pairs are available, our model significantly outperforms several state-of-the-art response generation models in both automatic and human evaluation.