We propose a utility component for dialog systems that takes the last two utterances of a user and detects whether the last utterance is an error correction of the second-to-last utterance. If it is, the component corrects the second-to-last utterance according to the correction given in the last utterance. In addition, the component outputs the extracted pairs of reparandum and repair entity. The component offers two advantages: it learns the concept of corrections, so that corrections do not have to be collected for every new domain, and it extracts reparandum and repair pairs, which makes it possible to learn from them. For error correction, we present one sequence labeling approach and two sequence-to-sequence approaches. For error correction detection, these three error correction approaches can also be used; in addition, we present a sequence classification approach. One error correction detection approach and one error correction approach can be combined into a pipeline, or the error correction approaches can be trained and used end-to-end to avoid having two components. We modified the EPIC-KITCHENS-100 dataset to evaluate the approaches for correcting entity phrases in request dialogs. For error correction detection and correction, we obtained an accuracy of 96.40 % on synthetic validation data and 77.85 % on human-created real-world test data.
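To make the sequence labeling idea more concrete, the following is a minimal sketch of how per-token labels could be turned into a corrected request and a list of reparandum and repair pairs. The label scheme (`C`, `D`, `R<i>`, `S<i>`), the `|` separator token, the function name `apply_correction`, and the example request are all assumptions made for illustration; the text above does not specify them, and the labels would in practice come from a trained model rather than being written by hand.

```python
from typing import Dict, List, Tuple


def apply_correction(
    tokens: List[str], labels: List[str]
) -> Tuple[str, List[Tuple[str, str]]]:
    """Rebuild the corrected request from hypothetical per-token labels.

    Assumed label scheme (illustrative only):
      "C"    copy the token into the output
      "D"    drop the token (e.g. the correcting utterance itself)
      "R<i>" token belongs to reparandum i (the phrase to be replaced)
      "S<i>" token belongs to repair i (the replacement phrase)
    """
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")

    # First pass: collect the repair phrases, grouped by their index.
    repairs: Dict[str, List[str]] = {}
    for token, label in zip(tokens, labels):
        if label.startswith("S"):
            repairs.setdefault(label[1:], []).append(token)

    # Second pass: build the output and collect the reparandum phrases.
    reparanda: Dict[str, List[str]] = {}
    emitted = set()
    output: List[str] = []
    for token, label in zip(tokens, labels):
        if label == "C":
            output.append(token)
        elif label.startswith("R"):
            index = label[1:]
            reparanda.setdefault(index, []).append(token)
            if index not in repairs:
                output.append(token)  # no matching repair: keep the original
            elif index not in emitted:
                output.extend(repairs[index])  # insert the repair once
                emitted.add(index)
        # "D" and "S<i>" tokens produce no output at their own position.

    pairs = [
        (" ".join(reparanda[i]), " ".join(repairs[i]))
        for i in reparanda
        if i in repairs
    ]
    return " ".join(output), pairs


# Hand-written example: the request, a separator, then the correction.
tokens = "put the cleaned knife into the drawer | no , I meant the spoon".split()
labels = ["C", "C", "C", "R1", "C", "C", "C",
          "D", "D", "D", "D", "D", "D", "S1"]
corrected, pairs = apply_correction(tokens, labels)
print(corrected)
print(pairs)
```

In this sketch, an empty `pairs` list can be read as "no correction detected", which mirrors the end-to-end use of a correction approach without a separate detection component.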