We present a non-autoregressive system submitted to the WMT 22 Efficient Translation Shared Task. Our system was used by Helcl et al. (2022) in an attempt to provide a fair comparison between non-autoregressive and autoregressive models. This submission is an effort to establish solid baselines along with a sound evaluation methodology, particularly with respect to measuring decoding speed. The model itself is a 12-layer Transformer trained with connectionist temporal classification (CTC) on a dataset knowledge-distilled from a strong autoregressive teacher model.
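The CTC objective used here marginalizes over all monotonic alignments between the model's per-position output distributions and the (shorter) target sequence, inserting an optional blank symbol between labels. As a minimal sketch of that idea, the following pure-Python implementation of the CTC forward (alpha) recursion computes the log-probability of a target under a toy set of per-step distributions; the list-of-lists representation, vocabulary size, and blank index are illustrative assumptions, not the submission's actual implementation:

```python
import math

def ctc_log_prob(log_probs, target, blank=0):
    """Log P(target | per-step distributions) via the CTC forward recursion.

    log_probs: length-T list; log_probs[t][v] is the log-prob of symbol v at step t.
    target: list of label ids (no blanks).
    """
    # Extended label sequence with blanks interleaved: b, y1, b, y2, ..., b
    ext = [blank]
    for y in target:
        ext += [y, blank]
    S = len(ext)
    NEG = float("-inf")

    def logadd(a, b):
        if a == NEG:
            return b
        if b == NEG:
            return a
        m = max(a, b)
        return m + math.log(math.exp(a - m) + math.exp(b - m))

    T = len(log_probs)
    alpha = [NEG] * S
    alpha[0] = log_probs[0][blank]          # start with a blank...
    if S > 1:
        alpha[1] = log_probs[0][ext[1]]     # ...or with the first label
    for t in range(1, T):
        new = [NEG] * S
        for s in range(S):
            a = alpha[s]                    # stay on the same extended symbol
            if s >= 1:
                a = logadd(a, alpha[s - 1]) # advance by one
            # skip the blank between two distinct labels
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                a = logadd(a, alpha[s - 2])
            new[s] = a + log_probs[t][ext[s]]
        alpha = new
    # valid paths end on the final blank or the final label
    total = alpha[S - 1]
    if S > 1:
        total = logadd(total, alpha[S - 2])
    return total
```

For example, with T = 2 steps, a uniform distribution over {blank, a}, and target "a", the three alignments (a,a), (a,blank), (blank,a) each have probability 0.25, so the total is 0.75. In the actual system the same marginalization is computed batched on GPU (e.g. via a framework-provided CTC loss), and its gradient trains the non-autoregressive decoder on the distilled references.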