Position encoding (PE), an essential part of self-attention networks (SANs), is used to preserve word order information for natural language processing tasks by generating fixed position indices for input sequences. However, in cross-lingual scenarios, e.g., machine translation, the PEs of source and target sentences are modeled independently. Due to word order divergences between languages, modeling cross-lingual positional relationships might help SANs bridge this gap. In this paper, we augment SANs with \emph{cross-lingual position representations} to model the bilingually aware latent structure of the input sentence. Specifically, we utilize bracketing transduction grammar (BTG)-based reordering information to encourage SANs to learn bilingual diagonal alignments. Experimental results on WMT'14 English$\Rightarrow$German, WAT'17 Japanese$\Rightarrow$English, and WMT'17 Chinese$\Leftrightarrow$English translation tasks demonstrate that our approach significantly and consistently improves translation quality over strong baselines. Extensive analyses confirm that the performance gains come from the cross-lingual information.
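For concreteness, the fixed position indices above refer to the absolute positions consumed by the standard sinusoidal encoding of the Transformer (Vaswani et al., 2017); we sketch that common formulation here, without claiming it is the exact PE variant used in this work:
\[
\mathrm{PE}_{(pos,\,2i)} = \sin\!\left(\frac{pos}{10000^{2i/d_{\mathrm{model}}}}\right),
\qquad
\mathrm{PE}_{(pos,\,2i+1)} = \cos\!\left(\frac{pos}{10000^{2i/d_{\mathrm{model}}}}\right).
\]
The minimal Python sketch below illustrates, under stated assumptions, how BTG-based reordering information could supply a second, target-order index for each source token: the permutation \texttt{btg\_reordered} and the combination step are hypothetical stand-ins for illustration, not the actual BTG parser output or the exact integration proposed in the paper.
\begin{verbatim}
import numpy as np

def sinusoidal_pe(positions, d_model):
    """Standard Transformer absolute position encoding (Vaswani et al., 2017)."""
    pos = np.asarray(positions, dtype=np.float64)[:, None]   # shape (n, 1)
    dims = np.arange(d_model)[None, :]                       # shape (1, d)
    angles = pos / np.power(10000.0, (2 * (dims // 2)) / d_model)
    # Even dimensions take sin, odd dimensions take cos.
    return np.where(dims % 2 == 0, np.sin(angles), np.cos(angles))

# Hypothetical illustration: an SOV source clause aligned to SVO output.
# "btg_reordered" is an assumed permutation from a BTG reordering model,
# mapping each source position to its target-order slot.
source_positions = [0, 1, 2, 3, 4]   # monolingual (original) order
btg_reordered    = [0, 1, 4, 2, 3]   # assumed BTG output: verb/object swapped
mono_pe  = sinusoidal_pe(source_positions, d_model=512)
xling_pe = sinusoidal_pe(btg_reordered, d_model=512)
# Combining mono_pe and xling_pe (e.g., by summation) would expose both the
# original and the bilingually reordered position of each token to the SANs.
\end{verbatim}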