Large-scale pretrained language models have achieved outstanding performance on natural language understanding tasks. However, how to apply them to dialogue generation tasks, especially those whose responses are conditioned on multiple input sources, remains under investigation. Previous work simply concatenates all input sources or averages the information from the different sources. In this work, we study dialogue models with multiple input sources adapted from the pretrained language model GPT2. We explore various methods of fusing the separate attention information corresponding to the different sources. Our experimental results show that proper fusion methods deliver higher relevance to the dialogue history than simple fusion baselines.
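To make the idea of fusing per-source attention concrete, the sketch below shows one plausible fusion scheme: attention read-outs computed separately over each source (e.g., dialogue history and a persona/knowledge source) are combined with position-wise learned weights rather than a plain average. This is a minimal illustrative example, not the paper's implementation; all names (`SourceFusion`, `hidden_size`, `num_sources`) are assumptions introduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SourceFusion(nn.Module):
    """Fuse per-source attention outputs with learned, position-wise weights."""

    def __init__(self, hidden_size: int, num_sources: int):
        super().__init__()
        # A gate scores each source at every position from the concatenated read-outs.
        self.gate = nn.Linear(hidden_size * num_sources, num_sources)

    def forward(self, source_outputs):
        # source_outputs: list of tensors, each (batch, seq_len, hidden_size),
        # e.g. attention over dialogue history and attention over a second source.
        stacked = torch.stack(source_outputs, dim=2)           # (B, T, S, H)
        concat = torch.cat(source_outputs, dim=-1)             # (B, T, S*H)
        weights = F.softmax(self.gate(concat), dim=-1)         # (B, T, S)
        fused = (weights.unsqueeze(-1) * stacked).sum(dim=2)   # (B, T, H)
        return fused


if __name__ == "__main__":
    fusion = SourceFusion(hidden_size=768, num_sources=2)
    history = torch.randn(1, 10, 768)   # attention output over dialogue history
    persona = torch.randn(1, 10, 768)   # attention output over an extra source
    print(fusion([history, persona]).shape)  # torch.Size([1, 10, 768])
```

Setting the gate's output to uniform weights would recover the simple averaging baseline mentioned above, which is why a learned gate is one natural candidate among the fusion methods the abstract alludes to.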