The development of Open-Domain Dialogue Systems (ODS) is a trending topic due to the large number of research challenges, the large societal and business impact, and advances in the underlying technology. However, the development of such systems requires two important capabilities: 1) automatic evaluation mechanisms that show high correlations with human judgments across multiple dialogue evaluation aspects, with explainable features that provide constructive and explicit feedback on the quality of generative models' responses for quick development and deployment; and 2) mechanisms that help control chatbot responses, avoiding toxicity while employing intelligent ways to handle toxic user comments and maintaining interaction flow and engagement. This track at the 10th Dialogue System Technology Challenge (DSTC10) is part of the ongoing effort to promote scalable and toxic-free ODS. This paper describes the datasets and baselines provided to participants, as well as the submission evaluation results for each of the two proposed subtasks.