Consistency is a key requirement of high-quality translation. It is especially important to adhere to pre-approved terminology and to adapt to corrected translations in domain-specific projects. Machine translation (MT) has achieved significant progress in the area of domain adaptation. However, real-time adaptation remains challenging. Large language models (LLMs) have recently shown interesting in-context learning capabilities, where they learn to replicate certain input-output text generation patterns without further fine-tuning. By feeding an LLM a prompt at inference time that consists of a list of translation pairs, the model can then simulate the domain and style characteristics of those examples. This work aims to investigate how we can utilize in-context learning to improve real-time adaptive MT. Our extensive experiments show promising results at translation time. For example, GPT-3.5 can adapt to a set of in-domain sentence pairs and/or terminology while translating a new sentence. We observe that the translation quality with few-shot in-context learning can surpass that of strong encoder-decoder MT systems, especially for high-resource languages. Moreover, we investigate whether we can combine MT from strong encoder-decoder models with fuzzy matches, which can further improve translation quality, especially for less supported languages. We conduct our experiments across five diverse language pairs, namely English-to-Arabic (EN-AR), English-to-Chinese (EN-ZH), English-to-French (EN-FR), English-to-Kinyarwanda (EN-RW), and English-to-Spanish (EN-ES).
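The adaptive prompting idea described above can be sketched as follows: retrieve the translation-memory pairs most similar to the new source sentence (fuzzy matches) and prepend them as few-shot examples before the sentence to be translated. This is a minimal illustration, not the paper's exact pipeline; the helper names, the similarity measure (Python's `difflib` ratio rather than the edit-distance-based fuzzy matching typical of CAT tools), and the example translation memory are all assumptions for demonstration.

```python
from difflib import SequenceMatcher

def fuzzy_matches(source, translation_memory, k=2):
    """Rank (source, target) pairs by surface similarity to the new source
    sentence and keep the top k as in-context examples."""
    scored = sorted(
        translation_memory,
        key=lambda pair: SequenceMatcher(None, source, pair[0]).ratio(),
        reverse=True,
    )
    return scored[:k]

def build_prompt(source, translation_memory, src_lang="English",
                 tgt_lang="French", k=2):
    """Assemble a few-shot prompt: k fuzzy-matched pairs, then the new
    source sentence with an empty target slot for the LLM to complete."""
    lines = []
    for src, tgt in fuzzy_matches(source, translation_memory, k):
        lines.append(f"{src_lang}: {src}")
        lines.append(f"{tgt_lang}: {tgt}")
    lines.append(f"{src_lang}: {source}")
    lines.append(f"{tgt_lang}:")
    return "\n".join(lines)

# Toy translation memory (hypothetical EN-FR pairs for illustration only).
tm = [
    ("The patient shows mild symptoms.",
     "Le patient présente des symptômes légers."),
    ("Press the power button.",
     "Appuyez sur le bouton d'alimentation."),
]

prompt = build_prompt("The patient shows severe symptoms.", tm, k=1)
print(prompt)
```

The resulting prompt would then be sent to the LLM, which completes the final `French:` line in the domain and style of the retrieved examples; pre-approved terminology pairs could be injected into the prompt in the same way.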