Modern automatic translation systems aim to place the human at the center by providing contextual support and knowledge. In this context, a critical task is enriching the output with information regarding the mentioned entities, which is currently achieved by processing the generated translation with named entity recognition (NER) and entity linking systems. In light of the recent promising results shown by direct speech translation (ST) models and the known weaknesses of cascades (error propagation and additional latency), in this paper we propose multitask models that jointly perform ST and NER, and we compare them with a cascade baseline. The experimental results show that our models significantly outperform the cascade on the NER task (by 0.4-1.0 F1), without degradation in terms of translation quality, and with the same computational efficiency as a plain direct ST model.
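To make the contrast between the two approaches concrete, below is a minimal, hypothetical sketch (not the paper's actual implementation) of a cascade, where NER runs on the already-generated translation, versus a joint multitask model whose single output carries both the translation and the entity annotations. The inline-tag output format and the model stand-ins are assumptions made only for illustration.

```python
# Hypothetical sketch: cascade (ST then NER) vs. joint ST+NER output handling.
import re
from typing import Callable, List, Tuple

def cascade(translate: Callable, tag: Callable, audio) -> Tuple[str, List[Tuple[str, str]]]:
    """Cascade baseline: ST first, then NER on the generated text.
    `translate` and `tag` stand in for trained ST and NER models;
    any error introduced by ST propagates into the NER step."""
    text = translate(audio)
    return text, tag(text)

def joint(joint_model: Callable, audio) -> Tuple[str, List[Tuple[str, str]]]:
    """Direct multitask model: one pass yields translation and entities,
    assumed here to be encoded as inline tags, e.g.
    '<PER>Barack Obama</PER> visited <LOC>Berlin</LOC> .'"""
    tagged = joint_model(audio)
    entities = [(m.group(2), m.group(1))
                for m in re.finditer(r"<(\w+)>(.*?)</\1>", tagged)]
    plain = re.sub(r"</?\w+>", "", tagged)          # strip the tags
    plain = re.sub(r"\s{2,}", " ", plain).strip()   # tidy spacing
    return plain, entities

if __name__ == "__main__":
    # Toy stand-in for the joint model, only to make the sketch runnable.
    demo = lambda _: "<PER>Barack Obama</PER> visited <LOC>Berlin</LOC> ."
    text, ents = joint(demo, audio=None)
    print(text)   # Barack Obama visited Berlin .
    print(ents)   # [('Barack Obama', 'PER'), ('Berlin', 'LOC')]
```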