What Do Compressed Multilingual Machine Translation Models Forget?

Very large pre-trained models have recently achieved state-of-the-art results in various natural language processing (NLP) tasks, but their size makes them harder to deploy in resource-constrained environments. Compression techniques can drastically reduce model size, and therefore inference time, with negligible impact on top-tier metrics. However, performance averaged across multiple tasks and/or languages may hide a drastic drop on under-represented features, which could amplify the biases encoded by the models. In this work, we assess the impact of compression methods on Multilingual Neural Machine Translation (MNMT) models for various language groups and for gender and semantic biases, through extensive analysis of compressed models on three machine translation benchmarks: FLORES-101, MT-Gender, and DiBiMT. We show that the performance of under-represented languages drops significantly, while the average BLEU score decreases only slightly. Interestingly, the removal of noisy memorization through compression leads to a significant improvement for some medium-resource languages. Finally, we demonstrate that compression amplifies intrinsic gender and semantic biases, even in high-resource languages. Code: https://github.com/alirezamshi/bias-compressedMT
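
To illustrate how an averaged metric can hide a drop on under-represented languages, here is a minimal sketch that compares the change in the overall mean score with the mean change per resource group. It is not the authors' evaluation code: the function names, the language codes, the group labels, and all scores are hypothetical and chosen only for illustration.

```python
from statistics import mean


def relative_change(base: float, comp: float) -> float:
    """Relative score change in percent after compression."""
    return 100.0 * (comp - base) / base


def summarize(baseline, compressed, group_of):
    """Return (change of the averaged score, mean per-language change by group)."""
    avg_change = relative_change(mean(baseline.values()), mean(compressed.values()))
    by_group = {}
    for lang, base in baseline.items():
        change = relative_change(base, compressed[lang])
        by_group.setdefault(group_of[lang], []).append(change)
    return avg_change, {group: mean(changes) for group, changes in by_group.items()}


# Hypothetical scores, NOT results from the paper.
baseline = {"fr": 40.0, "de": 35.0, "sw": 10.0, "wo": 4.0}
compressed = {"fr": 39.6, "de": 34.8, "sw": 9.0, "wo": 2.0}
group_of = {"fr": "high", "de": "high", "sw": "low", "wo": "low"}

avg_change, per_group = summarize(baseline, compressed, group_of)
print(f"change in averaged score: {avg_change:.1f}%")
for group, change in per_group.items():
    print(f"mean change for {group}-resource languages: {change:.1f}%")
```

With these made-up numbers the averaged score moves by only a few percent, while the low-resource group loses a far larger share of its score, which is the kind of gap a per-group breakdown is meant to expose.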

