Multi-source translation (MST), which typically receives multiple source sentences with the same meaning in different languages, has been shown to be superior to single-source translation. Since the quantity of multi-source parallel data is limited, it remains a challenge to take full advantage of single-source data and limited multi-source data so that models perform well when given as many sources as possible. Unlike previous work, which is mostly devoted to supervised scenarios, we focus on zero-shot MST: models are expected to process unseen combinations of multiple sources, e.g., unseen language combinations, during inference. We propose a simple yet effective parameter-efficient method, named Prompt Gating, which appends prompts to the model inputs and attaches gates to the extended hidden states of each encoder layer. It shows strong zero-shot transferability (up to +9.0 BLEU points) and remarkable compositionality (up to +15.6 BLEU points) on MST, and it also outperforms baselines on lexically constrained translation.
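To make the described mechanism concrete, the following is a minimal sketch of how prompts could be appended to an encoder layer's input and a gate applied to the extended (prompt) hidden states. This is an illustrative approximation under assumed design choices, not the paper's implementation: the class name `PromptGatedEncoderLayer`, the prompt length, and the sigmoid-gate parameterization are all assumptions.

```python
import torch
import torch.nn as nn


class PromptGatedEncoderLayer(nn.Module):
    """One Transformer encoder layer with trainable prompts and a gate on the prompt states (sketch)."""

    def __init__(self, d_model: int = 512, nhead: int = 8, prompt_len: int = 16):
        super().__init__()
        self.prompt_len = prompt_len
        # Trainable prompt vectors appended to this layer's input (assumed initialization).
        self.prompt = nn.Parameter(torch.randn(prompt_len, d_model) * 0.02)
        # Gate that rescales the hidden states at the prompt positions only.
        self.gate = nn.Sequential(nn.Linear(d_model, d_model), nn.Sigmoid())
        self.layer = nn.TransformerEncoderLayer(
            d_model, nhead, dim_feedforward=4 * d_model, batch_first=True
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, src_len, d_model). Extend the input with the prompts.
        batch = x.size(0)
        prompts = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        extended = torch.cat([prompts, x], dim=1)
        hidden = self.layer(extended)
        # Apply the gate to the extended hidden states; token states pass through unchanged.
        prompt_hidden = hidden[:, : self.prompt_len]
        token_hidden = hidden[:, self.prompt_len:]
        prompt_hidden = self.gate(prompt_hidden) * prompt_hidden
        return torch.cat([prompt_hidden, token_hidden], dim=1)


if __name__ == "__main__":
    layer = PromptGatedEncoderLayer()
    out = layer(torch.randn(2, 10, 512))
    print(out.shape)  # torch.Size([2, 26, 512]): 16 prompt slots + 10 source tokens
```

In this reading, each source (e.g., each input language) would contribute its own prompt and gate, so unseen combinations of sources at inference time amount to composing independently trained prompt-gate pairs; the exact composition rule used in the paper is not specified here.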