The ultimate goal of dialog research is to develop systems that can be effectively used in interactive settings by real users. To this end, we introduced the Interactive Evaluation of Dialog Track at the 9th Dialog System Technology Challenge. This track consisted of two sub-tasks. The first sub-task involved building knowledge-grounded response generation models. The second sub-task aimed to extend dialog models beyond static datasets by assessing them in an interactive setting with real users. Our track challenged participants to develop strong response generation models and to explore strategies for extending them to back-and-forth interactions with real users. The progression from static corpora to interactive evaluation introduces unique challenges and facilitates a more thorough assessment of open-domain dialog systems. This paper provides an overview of the track, including its methodology and results. Furthermore, it provides insights into how best to evaluate open-domain dialog models.