Open-domain conversational systems are expected to generate equally good responses across multiple domains. Previous work has achieved good performance on a single corpus, but training and evaluating on multiple corpora from different domains is less studied. This paper explores methods of generating relevant responses for each of multiple multi-domain corpora. We first examine interleaved learning, which intermingles multiple corpora, as the baseline. We then investigate two multi-domain learning methods, labeled learning and multi-task labeled learning, which encode each corpus through a unique corpus embedding. Furthermore, we propose Domain-specific Frequency (DF), a novel word-level importance weight that measures the relative importance of a word for a specific corpus compared to other corpora. Based on DF, we propose weighted learning, a method that integrates DF into the loss function. We also adopt DF as a new evaluation metric. Extensive experiments show that our methods achieve significant improvements on both automatic and human evaluation. We share our code and data for reproducibility.
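The abstract does not give the exact DF formula, but the intuition it describes (a word's relative importance in one corpus compared to the others) can be sketched as a relative-frequency ratio. The helper name `domain_specific_frequency` and the specific normalization below are illustrative assumptions, not the paper's definition:

```python
from collections import Counter

def domain_specific_frequency(corpora):
    """Hypothetical DF sketch: a word's relative frequency within one
    corpus, normalized by its relative frequency across all corpora.
    A DF above 1 means the word is over-represented in that domain.

    corpora: dict mapping domain name -> list of utterance strings.
    Returns: dict mapping domain name -> {word: DF score}.
    """
    # Per-domain word counts.
    counts = {d: Counter(tok for utt in utts for tok in utt.split())
              for d, utts in corpora.items()}
    # Pooled counts over all domains.
    totals = Counter()
    for c in counts.values():
        totals.update(c)
    n_total = sum(totals.values())

    df = {}
    for d, c in counts.items():
        n_d = sum(c.values())
        df[d] = {w: (c[w] / n_d) / (totals[w] / n_total) for w in c}
    return df

# Toy example: "patient" is specific to the medical corpus, while a
# function word like "the" appears evenly and scores around 1.
corpora = {
    "medical": ["the patient has fever", "the doctor"],
    "travel": ["the flight to paris", "the hotel"],
}
df = domain_specific_frequency(corpora)
```

Under this sketch, a weighted training loss would scale each token's negative log-likelihood by its DF score, so domain-specific words contribute more to the gradient than generic ones.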