End-to-end spoken language understanding (SLU) suffers from the long-tail word problem. This paper exploits contextual biasing, a technique to improve the speech recognition of rare words, in end-to-end SLU systems. Specifically, it studies the tree-constrained pointer generator (TCPGen), a powerful and efficient biasing model component, which leverages a slot shortlist with corresponding entities to extract biasing lists. In addition, to bias the slot distribution output by the SLU model, a slot probability biasing (SPB) mechanism is proposed that calculates a slot distribution from TCPGen. Experiments on the SLURP dataset showed consistent SLU-F1 improvements using TCPGen and SPB, especially on unseen entities. On a new split that holds out 5 slot types for testing, TCPGen with SPB achieved zero-shot learning with an SLU-F1 score over 50%, whereas the baselines cannot handle the held-out slot types. In addition to slot filling, the intent classification accuracy was also improved.
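To make the idea of extracting a biasing list from a slot shortlist more concrete, the following is a minimal sketch and not the paper's implementation. It assumes a hypothetical dictionary `SLOT_ENTITIES` mapping slot types to known entities, and it builds a character-level prefix tree for simplicity, which is an assumption of this example rather than the tokenization used by the authors. All function names are illustrative. Each tree node records which slot types remain reachable, which is the kind of information a slot-level bias could plausibly draw on.

```python
# Hypothetical knowledge base (illustrative only): slot type -> known entities.
SLOT_ENTITIES = {
    "person": ["anna", "annabel"],
    "place_name": ["annecy", "bonn"],
    "music_genre": ["bossa nova"],
}


def build_biasing_list(slot_shortlist, slot_entities):
    """Collect (entity, slot) pairs for the shortlisted slot types only."""
    return [
        (entity, slot)
        for slot in slot_shortlist
        for entity in slot_entities.get(slot, [])
    ]


def _new_node():
    # Each node stores its children and the slot types reachable through it.
    return {"children": {}, "slots": set()}


def build_prefix_tree(biasing_list):
    """Build a character-level prefix tree over the biasing entities."""
    root = _new_node()
    for entity, slot in biasing_list:
        node = root
        node["slots"].add(slot)
        for ch in entity:
            node = node["children"].setdefault(ch, _new_node())
            node["slots"].add(slot)
    return root


def valid_continuations(root, prefix):
    """Return (next symbols, reachable slots) allowed by the tree after `prefix`.

    Both sets are empty when `prefix` does not match any biasing entity.
    """
    node = root
    for ch in prefix:
        node = node["children"].get(ch)
        if node is None:
            return set(), set()
    return set(node["children"]), set(node["slots"])


if __name__ == "__main__":
    biasing_list = build_biasing_list(["person", "place_name"], SLOT_ENTITIES)
    tree = build_prefix_tree(biasing_list)
    symbols, slots = valid_continuations(tree, "ann")
    print(sorted(symbols), sorted(slots))
```

In this sketch the tree restricts which symbols can be pointed to at each decoding step, so entities from slot types outside the shortlist (here `music_genre`) never enter the candidate set.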