Existing goal-oriented dialogue datasets focus mainly on identifying slots and values. In practice, however, customer support interactions often also require agents to follow multi-step procedures derived from explicitly defined company policies. To study customer service dialogue systems in more realistic settings, we introduce the Action-Based Conversations Dataset (ABCD), a fully labeled dataset with over 10K human-to-human dialogues containing 55 distinct user intents, each requiring a unique sequence of actions constrained by policies to achieve task success. We propose two additional dialogue tasks, Action State Tracking and Cascading Dialogue Success, and establish a series of baselines involving large-scale, pre-trained language models on this dataset. Empirical results demonstrate that while more sophisticated networks outperform simpler models, a considerable gap (50.8% absolute accuracy) to human-level performance on ABCD remains.