Entity tags in human-machine dialog are integral to natural language understanding (NLU) tasks in conversational assistants. However, current systems, which typically rely on text input alone, struggle to parse spoken queries accurately and often fail to understand the user's intent. Previous work in linguistics has identified a cross-language tendency for longer speech pauses around nouns than around verbs. We demonstrate that this linguistic observation about pauses can be used to improve accuracy in machine-learnt language understanding tasks. Analysis of pauses in French and English utterances from a commercial voice assistant shows a statistically significant difference in pause duration around multi-token entity span boundaries compared to within entity spans. Additionally, in contrast to text-based NLU, we use pause duration to enrich contextual embeddings and improve shallow parsing of entities. Results show that our proposed novel embeddings reduce the error rate by up to 8% relative, consistently across three domains for French, without any added annotation or alignment costs to the parser.
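As a rough illustration of what enriching embeddings with pause duration could look like, the sketch below appends two pause features to each token vector. It is not the method of the paper: the function name `enrich_with_pauses`, the choice of log-scaled pauses before and after each token, the concatenation strategy, and the example values are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def enrich_with_pauses(
    token_embeddings: Sequence[Sequence[float]],
    pauses_before: Sequence[float],
    pauses_after: Sequence[float],
) -> List[List[float]]:
    """Append log-scaled pause durations (in seconds) to each token vector.

    Hypothetical helper: one pause value before and one after every token.
    """
    if not (len(token_embeddings) == len(pauses_before) == len(pauses_after)):
        raise ValueError("need exactly one pause pair per token")

    enriched = []
    for vector, before, after in zip(token_embeddings, pauses_before, pauses_after):
        if before < 0 or after < 0:
            raise ValueError("pause durations must be non-negative")
        # log1p compresses long pauses and maps a zero-length pause to 0.0
        enriched.append(list(vector) + [math.log1p(before), math.log1p(after)])
    return enriched


# Made-up utterance with a longer pause at the entity span boundary
tokens = ["play", "bohemian", "rhapsody"]
embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # toy 2-dimensional vectors
pauses_before = [0.0, 0.25, 0.02]
pauses_after = [0.25, 0.02, 0.0]

enriched = enrich_with_pauses(embeddings, pauses_before, pauses_after)
```

Each enriched vector keeps the original text-based dimensions and gains two acoustic ones, so a downstream tagger could consume it without any change to the annotation scheme.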