Large-scale generative models show an impressive ability to perform a wide range of Natural Language Processing (NLP) tasks using in-context learning, where a few examples are used to describe a task to the model. For Machine Translation (MT), these examples are typically randomly sampled from the development dataset, which has a distribution similar to that of the evaluation set. However, it is unclear how the choice of these in-context examples and their ordering impacts the output translation quality. In this work, we aim to understand the properties of good in-context examples for MT in both in-domain and out-of-domain settings. We show that the translation quality and the domain of the in-context examples matter, and that a single noisy, unrelated 1-shot example can have a catastrophic impact on output quality. While concatenating multiple random examples reduces the effect of noise, a single good prompt optimized to maximize translation quality on the development dataset can elicit learned information from the pre-trained language model. Adding similar examples based on n-gram overlap with the test source significantly and consistently improves the translation quality of the outputs, outperforming a strong kNN-MT baseline on 2 out of 4 out-of-domain datasets.
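The similarity-based selection described above can be sketched as follows. This is a minimal illustration only, assuming a simple word-level n-gram recall score against the test source; the function names and scoring details are hypothetical, not the authors' exact method.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def overlap_score(test_source, candidate_source, max_n=2):
    """Average n-gram recall of the test source covered by a candidate.

    Assumed scoring: for each order n, count matched n-grams (multiset
    intersection) divided by the number of n-grams in the test source.
    """
    score = 0.0
    test_toks = test_source.split()
    cand_toks = candidate_source.split()
    for n in range(1, max_n + 1):
        test_counts = Counter(ngrams(test_toks, n))
        cand_counts = Counter(ngrams(cand_toks, n))
        matched = sum((test_counts & cand_counts).values())
        score += matched / max(sum(test_counts.values()), 1)
    return score / max_n


def select_examples(test_source, pool, k=4):
    """Pick the k (source, target) pairs whose source side overlaps most
    with the test source; these become the in-context examples."""
    return sorted(pool,
                  key=lambda ex: overlap_score(test_source, ex[0]),
                  reverse=True)[:k]
```

The selected pairs would then be concatenated into the prompt, most similar example closest to the test input or in any fixed order under study.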