Neural Chat Translation (NCT) aims to translate conversational text into different languages. Existing methods mainly focus on modeling bilingual dialogue characteristics (e.g., coherence) to improve chat translation via multi-task learning on small-scale chat translation data. Although NCT models have achieved impressive success, their performance is still far from satisfactory due to insufficient chat translation data and overly simple joint training schemes. To address these issues, we propose a scheduled multi-task learning framework for NCT. Specifically, we devise a three-stage training framework that incorporates large-scale in-domain chat translation data into training by adding a second pre-training stage between the original pre-training and fine-tuning stages. Further, we investigate where and how to schedule the dialogue-related auxiliary tasks across the training stages to effectively enhance the main chat translation task. Extensive experiments on four language directions (English↔Chinese and English↔German) verify the effectiveness and superiority of the proposed approach. Additionally, we make the large-scale in-domain paired bilingual dialogue dataset publicly available to the research community.
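To make the staged schedule concrete, below is a minimal sketch of how such a three-stage scheduled multi-task training loop could be organized: general pre-training with the translation objective only, a second pre-training stage on in-domain bilingual dialogue data with the auxiliary tasks switched on, and final fine-tuning on the small chat translation set. All identifiers here (the stage names, the placeholder auxiliary task list, and the `train_step` callable) are hypothetical illustrations of the idea, not the authors' implementation.

```python
import random
from typing import Callable, Dict, List

# Placeholder dialogue-related auxiliary tasks; the abstract mentions
# dialogue characteristics such as coherence, so these names are assumed.
AUX_TASKS: List[str] = ["dialogue_coherence", "response_generation"]

# Assumed schedule: which data and which objectives are active per stage.
STAGES: List[Dict] = [
    # Stage 1: original pre-training on general parallel data (translation only).
    {"name": "general_pretrain", "data": "general_parallel",
     "tasks": ["translation"]},
    # Stage 2: second pre-training on large-scale in-domain bilingual
    # dialogue data, jointly with the scheduled auxiliary tasks.
    {"name": "in_domain_pretrain", "data": "in_domain_dialogue",
     "tasks": ["translation"] + AUX_TASKS},
    # Stage 3: fine-tuning on the small annotated chat translation set.
    {"name": "finetune", "data": "chat_translation",
     "tasks": ["translation"] + AUX_TASKS},
]

def run_schedule(train_step: Callable[[str, List[str]], float],
                 steps_per_stage: Dict[str, int]) -> None:
    """Run the three stages in order, activating each stage's task set.

    train_step is expected to return the joint loss over the active tasks;
    how the per-task losses are weighted or annealed is left to the caller.
    """
    for stage in STAGES:
        loss = 0.0
        for _ in range(steps_per_stage[stage["name"]]):
            loss = train_step(stage["data"], stage["tasks"])
        print(f"finished {stage['name']} (last joint loss = {loss:.3f})")

if __name__ == "__main__":
    # Dummy train_step so the sketch runs end to end without a real model.
    dummy_step = lambda data, tasks: sum(random.random() for _ in tasks) / len(tasks)
    run_schedule(dummy_step, {"general_pretrain": 3,
                              "in_domain_pretrain": 3,
                              "finetune": 3})
```

The design point the abstract emphasizes is that the auxiliary tasks are not simply trained jointly everywhere: where (which stage) and how (which task mix) they enter the schedule is itself a decision that must be investigated.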