This paper describes the University of Sydney & JD's joint submission to the IWSLT 2021 low-resource speech translation task. We participated in the Swahili-English direction and achieved the best sacreBLEU score (25.3) among all participants. Our constrained system is based on a pipeline framework, i.e., ASR followed by NMT. We trained our models with the officially provided ASR and MT datasets. The ASR system is based on the open-source toolkit Kaldi, and this work mainly explores how to make the most of the NMT models. To reduce the punctuation errors generated by the ASR model, we employ our previous work, SlotRefine, to train a punctuation correction model. To achieve better translation performance, we explored the most recent effective strategies, including back-translation, knowledge distillation, multi-feature reranking, and transductive finetuning. For the model structure, we tried both autoregressive and non-autoregressive models. In addition, we proposed two novel pre-training approaches, i.e., \textit{de-noising training} and \textit{bidirectional training}, to fully exploit the data. Extensive experiments show that adding the above techniques consistently improves the BLEU scores, and the final submission system outperforms the baseline (a Transformer ensemble model trained with the original parallel data) by approximately 10.8 BLEU, achieving state-of-the-art performance.
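To make the cascaded structure concrete, the following is a minimal Python sketch of the pipeline the abstract describes: ASR, then punctuation correction, then NMT with n-best reranking. All function names and bodies here are hypothetical placeholders standing in for the Kaldi ASR system, the SlotRefine-style punctuation model, and the NMT ensemble; they are not the authors' actual implementation.

\begin{verbatim}
from typing import List

# Hypothetical stand-ins for the pipeline stages described above.

def transcribe(audio_path: str) -> str:
    """Kaldi-based ASR: audio -> raw Swahili transcript (placeholder)."""
    return "habari za asubuhi"

def restore_punctuation(text: str) -> str:
    """SlotRefine-style punctuation correction of the ASR output
    (placeholder)."""
    return text.capitalize() + "."

def translate_nbest(text: str, n: int = 5) -> List[str]:
    """NMT ensemble producing an n-best list of English hypotheses
    (placeholder)."""
    return ["Good morning."] * n

def rerank(hypotheses: List[str]) -> str:
    """Multi-feature reranking: select the best hypothesis
    (placeholder)."""
    return hypotheses[0]

def speech_translate(audio_path: str) -> str:
    """Cascaded ST: ASR -> punctuation correction -> NMT -> reranking."""
    transcript = transcribe(audio_path)
    punctuated = restore_punctuation(transcript)
    return rerank(translate_nbest(punctuated))

if __name__ == "__main__":
    print(speech_translate("example.wav"))
\end{verbatim}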