Current Spoken Dialogue Systems (SDSs) often serve as passive listeners that respond only after receiving user speech. To achieve human-like dialogue, we propose a novel future-prediction architecture that allows an SDS to anticipate the user's future affective reactions from its own current behaviors, before the user speaks. In this work, we investigate two scenarios: speech and laughter. For speech, we propose to predict the user's future emotion based on its temporal relationship with the system's current emotion and its causal relationship with the system's current Dialogue Act (DA). For laughter, we propose to predict the occurrence and type of the user's laughter based on the system's laughter behaviors in the current turn. A preliminary analysis of human-robot dialogue demonstrated synchronicity in the emotions and laughter displayed by the human and the robot, as well as DA-emotion causality in their dialogue. These findings suggest that our architecture can contribute to the development of an anticipatory SDS.