Simultaneous machine translation (SimulMT) speeds up translation by beginning to translate before the source sentence is fully available. The task is difficult due to the limited context and the word order differences between languages. Existing methods either increase latency or introduce adaptive read-write policies so that SimulMT models can handle local reordering and improve translation quality. However, long-distance reordering causes SimulMT models to learn translation incorrectly: the model may be forced to predict target tokens before the corresponding source tokens have been read, which leads to aggressive anticipation at inference time and produces hallucinations. To mitigate this problem, we propose a new framework that decomposes the translation process into a monotonic translation step and a reordering step, and we model the latter with an auxiliary sorting network (ASN). The ASN rearranges the hidden states to match the word order of the target language, so that the SimulMT model can learn to translate more reasonably. The entire model is optimized end to end and does not rely on external aligners or data. At inference time, the ASN is removed to achieve streaming translation. Experiments show that the proposed framework outperforms previous methods with lower latency.
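To illustrate the core idea, the sketch below shows what it means to rearrange source-order hidden states into target-language order. This is a toy with a hard-coded permutation; in the proposed framework the ASN learns the reordering end to end, and its actual architecture is not specified in this abstract — the function name, the list-of-vectors representation, and the example sentence are all illustrative assumptions.

```python
def reorder_hidden_states(hidden_states, permutation):
    """Rearrange source-order hidden states into target order.

    hidden_states: list of per-token hidden states, in source order
    permutation:   permutation[i] = index of the source state that should
                   appear at target position i
    """
    # Sanity check: the index list must be a valid permutation.
    assert sorted(permutation) == list(range(len(hidden_states)))
    return [hidden_states[j] for j in permutation]

# Toy example (illustrative only): a verb-final source order rendered
# into English-like subject-verb-object order. Each state is labelled
# by the source token it encodes.
states = ["h(ich)", "h(dich)", "h(liebe)"]   # toy verb-final order
target_order = [0, 2, 1]                     # -> "I love you" order
print(reorder_hidden_states(states, target_order))
# -> ['h(ich)', 'h(liebe)', 'h(dich)']
```

With the states in target order, the monotonic translation step no longer has to anticipate unread source tokens, which is the motivation the abstract gives for the decomposition.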