As task-oriented dialog systems become increasingly common in everyday life, more realistic tasks have been proposed and explored, and new practical challenges have arisen with them. For instance, current dialog systems cannot effectively handle multiple search results when querying a database, because existing public datasets lack such scenarios. In this paper, we propose Database Search Result (DSR) Disambiguation, a novel task that focuses on disambiguating database search results; it enhances the user experience by letting users choose from multiple options instead of just one. To study this task, we augment the popular task-oriented dialog datasets (MultiWOZ and SGD) with turns that resolve ambiguities by (a) synthetically generating turns through a pre-defined grammar and (b) collecting human paraphrases for a subset. We find that training on our augmented dialog data improves the model's ability to handle ambiguous scenarios without sacrificing performance on unmodified turns. Furthermore, pre-fine-tuning and multi-task learning improve our model's performance on DSR disambiguation even in the absence of in-domain data, suggesting that it can be learned as a universal dialog skill. Our data and code will be made publicly available.
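To make the synthetic augmentation step in (a) more concrete, the following is a minimal sketch of how a pre-defined grammar might generate a pair of disambiguation turns from several database search results. It is an illustration only, not the grammar used in the paper: the templates, the `GRAMMAR` table, the `make_disambiguation_turns` function, and the example restaurant names are all hypothetical.

```python
import random

# Hypothetical toy grammar: each speaker role maps to a list of templates.
# These templates are illustrative assumptions, not the paper's grammar.
GRAMMAR = {
    "SYSTEM": [
        "I found {n} {domain}s: {options}. Which one would you like?",
        "There are {n} matching {domain}s: {options}. Do you have a preference?",
    ],
    "USER": [
        "The {ordinal} one, please.",
        "I'll take {name}.",
        "Let's go with {name}.",
    ],
}

ORDINALS = ["first", "second", "third", "fourth", "fifth"]


def join_options(names):
    """Render a list of names as 'A, B and C'."""
    if len(names) == 1:
        return names[0]
    return ", ".join(names[:-1]) + " and " + names[-1]


def make_disambiguation_turns(domain, results, rng):
    """Build one system turn listing all results and one user turn picking one.

    `results` is a list of entity names returned by a database query.
    `rng` is a random.Random instance, so generation is reproducible.
    """
    if len(results) < 2:
        raise ValueError("disambiguation needs at least two search results")
    if len(results) > len(ORDINALS):
        raise ValueError("this toy grammar supports at most five results")

    # Pick which result the simulated user will select.
    target = rng.randrange(len(results))

    system = rng.choice(GRAMMAR["SYSTEM"]).format(
        n=len(results), domain=domain, options=join_options(results)
    )
    # Unused keyword arguments are ignored by str.format, so every user
    # template can be filled from the same set of slots.
    user = rng.choice(GRAMMAR["USER"]).format(
        ordinal=ORDINALS[target], name=results[target]
    )
    return {"system": system, "user": user, "selected": results[target]}


if __name__ == "__main__":
    rng = random.Random(0)
    turns = make_disambiguation_turns(
        "restaurant", ["Curry Garden", "Nandos", "Pizza Hut"], rng
    )
    print(turns["system"])
    print(turns["user"])
    print("selected:", turns["selected"])
```

The sketch records the selected entity alongside the generated text, so a label remains available for training or evaluation even after the user turn is paraphrased by a human, as in step (b).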