In this paper, we study the use of a deep Transformer translation model for the CCMT 2022 Chinese-Thai low-resource machine translation task. We first explore experiment settings for the low-resource scenario (including the number of BPE merge operations, the dropout probability, the embedding size, etc.) with a 6-layer Transformer. Considering that increasing the number of layers also strengthens regularization, since dropout modules are introduced along with the new layers' parameters, we keep the best-performing shallow setting but increase the depth of the Transformer to 24 layers to obtain improved translation quality. Our work achieves SOTA performance on Chinese-to-Thai translation in the constrained evaluation.
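To make the described setup concrete, the sketch below shows one way the two configurations could be expressed. Only the depths (6 and 24 layers) and the names of the searched hyperparameters (BPE merge operations, dropout probability, embedding size) come from the abstract; all numeric defaults are illustrative placeholders, not the paper's reported values.

```python
from dataclasses import dataclass, replace


@dataclass
class TransformerConfig:
    # Hyperparameters searched in the low-resource setting.
    # The values below are illustrative placeholders; only the
    # depths 6 and 24 are stated in the abstract.
    bpe_merge_ops: int = 32000  # number of BPE merge operations
    dropout: float = 0.1        # dropout probability
    embed_dim: int = 512        # embedding size
    num_layers: int = 6         # encoder/decoder depth


# Baseline 6-layer model used for the hyperparameter search.
shallow = TransformerConfig()

# Deep model: keep the best-performing shallow settings unchanged
# and only increase the depth to 24 layers.
deep = replace(shallow, num_layers=24)
```

The point of the sketch is the procedure implied by the abstract: tune the remaining hyperparameters on the cheaper 6-layer model, then carry the winning setting over while changing depth alone, relying on the extra dropout modules of the added layers to regularize the larger model.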