Simultaneous Speech-to-text Translation (SimulST) systems translate source speech in tandem with the speaker, using only partial input. Recent work has sought to leverage the text translation task to improve the performance of Speech Translation (ST) in the offline domain. Motivated by these improvements, we propose adding Decision Attentive Regularization (DAR) to Monotonic Multihead Attention (MMA) based SimulST systems. DAR improves the read/write decisions for speech using the Simultaneous text Translation (SimulMT) task. We also extend several techniques from the offline domain to the SimulST task. Our proposed system achieves significant performance improvements on the MuST-C English-German (EnDe) SimulST task, with an average BLEU score improvement of around 4.57 points, or 34.17%, across different latencies. Further, the latency-quality tradeoffs establish that the proposed model achieves better results than the baseline.