In a sentence, certain words are critical to its meaning. Among them, named entities (NEs) are notoriously challenging for neural models. Despite their importance, their accurate handling has been neglected in speech-to-text (S2T) translation research, and recent work has shown that S2T models perform poorly on locations and, notably, on person names, whose spelling is hard to get right unless it is known in advance. In this work, we explore how to leverage dictionaries of NEs that are likely to appear in a given context to improve S2T model outputs. Our experiments show that we can reliably detect the NEs likely present in an utterance starting from S2T encoder outputs. Indeed, we demonstrate that the current detection quality is sufficient to improve NE accuracy in translation, yielding a 31% reduction in person name errors.