A challenge in spoken language translation is that much spoken content is long-form, whereas high-quality translation requires short units. To address this mismatch, we fine-tune a general-purpose large language model to split long ASR transcripts into segments that can be translated independently so as to maximize overall translation quality. We compare against several segmentation strategies and find that our approach improves translation quality on three languages by an average of 2.7 BLEU over an automatic punctuation baseline. Further, we demonstrate the effectiveness of two constrained decoding strategies, which improve the well-formedness of the model output from above 99% to 100%.
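To make the notion of well-formedness concrete, the following is a minimal sketch rather than the method used in this work. It assumes that a well-formed output reproduces the transcript tokens exactly, with only boundary markers inserted, and that a constrained decoder can enforce this by restricting each step to either the next transcript token or a boundary. The marker `<seg>` and the helper names `is_well_formed`, `allowed_next_tokens`, and `split_segments` are hypothetical.

```python
from typing import List

BOUNDARY = "<seg>"  # hypothetical boundary marker, assumed for illustration


def is_well_formed(transcript: List[str], output: List[str],
                   boundary: str = BOUNDARY) -> bool:
    """True if the output is the transcript with only boundary markers added."""
    return [tok for tok in output if tok != boundary] == list(transcript)


def allowed_next_tokens(transcript: List[str], consumed: int,
                        last_was_boundary: bool,
                        boundary: str = BOUNDARY) -> List[str]:
    """Tokens a constrained decoder may emit after `consumed` transcript tokens.

    Assumed rule: copy the next transcript token, or insert a boundary, but
    never open with a boundary or emit two boundaries in a row.
    """
    if consumed >= len(transcript):
        return []  # transcript fully copied; decoding should stop
    allowed = [transcript[consumed]]
    if consumed > 0 and not last_was_boundary:
        allowed.append(boundary)
    return allowed


def split_segments(output: List[str], boundary: str = BOUNDARY) -> List[str]:
    """Turn a marked-up token sequence into independently translatable segments."""
    segments, current = [], []
    for tok in output:
        if tok == boundary:
            if current:
                segments.append(" ".join(current))
                current = []
        else:
            current.append(tok)
    if current:
        segments.append(" ".join(current))
    return segments
```

Under these assumptions, an unconstrained model can occasionally drop or alter a word, which `is_well_formed` would flag, whereas masking each decoding step to `allowed_next_tokens` rules such outputs out by construction.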