Training dialogue systems often entails dealing with noisy training examples and unexpected user inputs. Despite their prevalence, there is currently no accurate survey of dialogue noise, nor a clear sense of the impact of each noise type on task performance. This paper addresses this gap by first constructing a taxonomy of the noise encountered by dialogue systems. We then run a series of experiments to show how different models behave when subjected to varying levels and types of noise. Our results reveal that models are quite robust to the label errors commonly tackled by existing denoising algorithms, but that performance suffers from dialogue-specific noise. Driven by these observations, we design a data cleaning algorithm specialized for conversational settings and apply it as a proof of concept for targeted dialogue denoising.