CUNI Systems for the Unsupervised News Translation Task in WMT 2019

In this paper, we describe the CUNI translation system used for the unsupervised news shared task of the ACL 2019 Fourth Conference on Machine Translation (WMT19). We follow the strategy of Artetxe et al. (2018b), creating a seed phrase-based system where the phrase table is initialized from cross-lingual embedding mappings trained on monolingual data, followed by a neural machine translation system trained on synthetic parallel data. The synthetic corpus was produced from a monolingual corpus by a tuned PBMT model refined through iterative back-translation. We further focus on the handling of named entities, i.e. the part of the vocabulary where the cross-lingual embedding mapping suffers most. Our system reaches a BLEU score of 15.3 on the German-Czech WMT19 shared task.
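The phrase-table initialization mentioned in the abstract can be illustrated with a minimal sketch. It assumes that source and target embeddings have already been mapped into one shared space, and it scores candidate translations with a temperature-scaled softmax over cosine similarity, in the spirit of Artetxe et al. (2018b). The function names, the toy vectors, the temperature value, and the `top_k` cutoff are all illustrative assumptions, not the CUNI system's actual code or settings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def induce_phrase_table(src_emb, tgt_emb, temperature=0.1, top_k=2):
    """Build a toy phrase table from embeddings in a shared space.

    For each source entry, keep the top_k most similar target entries
    and turn their similarities into probabilities with a softmax.
    temperature and top_k are assumed values chosen for illustration.
    """
    table = {}
    for src, u in src_emb.items():
        sims = {tgt: cosine(u, v) for tgt, v in tgt_emb.items()}
        best = sorted(sims, key=sims.get, reverse=True)[:top_k]
        max_sim = sims[best[0]]
        # Subtract the maximum before exponentiating for numerical stability.
        weights = {t: math.exp((sims[t] - max_sim) / temperature) for t in best}
        total = sum(weights.values())
        table[src] = {t: w / total for t, w in weights.items()}
    return table


# Hypothetical 3-dimensional embeddings, invented for this example and
# assumed to be already mapped into a common German-Czech space.
src_emb = {"Hund": [0.9, 0.1, 0.0], "Haus": [0.0, 0.2, 0.9]}
tgt_emb = {
    "pes": [1.0, 0.0, 0.1],
    "dům": [0.1, 0.1, 1.0],
    "kočka": [0.7, 0.6, 0.0],
}

phrase_table = induce_phrase_table(src_emb, tgt_emb)
for src, candidates in phrase_table.items():
    print(src, candidates)
```

In a real pipeline the resulting scores would serve as translation probabilities in the seed PBMT system. Named entities are a weak spot for this step because names that occur in similar contexts tend to have similar embeddings, so the nearest neighbour of a name is often a different name.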
