The performance of a task-completion dialogue agent usually affects the user experience: when the conversation system yields an unreasonable response, users may feel dissatisfied. Moreover, early termination often occurs in disappointing conversations. However, existing off-the-shelf user simulators generally assume an ideal, cooperative user, which differs from real users and inevitably leads to a sub-optimal dialogue policy. In this paper, we propose an emotion-aware user simulation framework for task-oriented dialogue, which uses the OCC emotion model to update user emotions and drive user actions, so as to generate simulated behaviors that are more similar to those of real users. We present a linear implementation (the source code will be released soon) that is easy to understand and extend, and evaluate it on two domain-specific datasets. The experimental results show that the emotional simulation results of our framework are consistent with common sense and generalize well across different domains. Meanwhile, our framework provides another perspective for understanding how a reinforcement-learning-based dialogue policy improves during training.
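To make the core idea concrete, the following is a minimal, hypothetical Python sketch (not the paper's actual implementation): a user simulator keeps an emotion state that is updated linearly from OCC-style appraisals of the system's last action, and the emotion in turn drives cooperative behavior or early termination. All feature names, weights, and thresholds here are illustrative assumptions.

```python
# Hypothetical sketch of an emotion-aware user simulator: a valence score is
# updated linearly from appraisal features of the system action (OCC-inspired),
# and the resulting emotion modulates the next user action, including early
# termination when the user becomes too dissatisfied.

import random
from dataclasses import dataclass, field


@dataclass
class EmotionState:
    # Single valence score in [-1, 1]; negative values mean dissatisfaction.
    valence: float = 0.0


@dataclass
class EmotionAwareUserSimulator:
    emotion: EmotionState = field(default_factory=EmotionState)
    decay: float = 0.8             # how much of the previous emotion is kept
    quit_threshold: float = -0.6   # below this, the user terminates early

    def appraise(self, system_action: dict) -> float:
        """Map a system action to a scalar appraisal (illustrative features)."""
        score = 0.0
        if system_action.get("fulfils_goal"):
            score += 1.0
        if system_action.get("asks_repeated_slot"):
            score -= 0.5
        if system_action.get("irrelevant_response"):
            score -= 1.0
        return score

    def step(self, system_action: dict) -> str:
        """Update the emotion linearly and choose the next user action."""
        appraisal = self.appraise(system_action)
        v = self.decay * self.emotion.valence + (1 - self.decay) * appraisal
        self.emotion.valence = max(-1.0, min(1.0, v))

        if self.emotion.valence < self.quit_threshold:
            return "terminate_dialogue"   # early termination by an unhappy user
        if self.emotion.valence > 0.3:
            return "cooperative_inform"   # satisfied users answer readily
        return random.choice(["inform", "ask_clarification"])


if __name__ == "__main__":
    user = EmotionAwareUserSimulator()
    turns = [
        {"asks_repeated_slot": True},
        {"irrelevant_response": True},
        {"irrelevant_response": True},
    ]
    for t in turns:
        print(user.step(t), round(user.emotion.valence, 2))
```

In this toy version, a run of disappointing system actions drives the valence below the quit threshold and the simulated user ends the dialogue, mirroring the early-termination behavior the framework aims to reproduce.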