This paper describes the Charles University submission to the Multilingual Low-Resource Translation for Indo-European Languages shared task at WMT21. We competed in translation from Catalan into Romanian, Italian, and Occitan. Our systems are based on a shared multilingual model. We show that using a joint model for multiple similar language pairs improves translation quality in each pair. We also demonstrate that character-level bilingual models are competitive for very similar language pairs (Catalan-Occitan) but less so for more distant pairs. Finally, we describe our experiments with multi-task learning, in which, aside from textual translation, the models are also trained to perform grapheme-to-phoneme conversion.