Pre-trained language models (PLMs) often exploit the monolingual and multilingual datasets that are freely available online to acquire general or mixed-domain knowledge before being deployed on specific tasks. Extra-large PLMs (xLPLMs) have recently been proposed, with claims of superior performance over smaller-sized PLMs on tasks such as machine translation (MT). These xLPLMs include Meta-AI's wmt21-dense-24-wide-en-X (2021) and NLLB (2022). In this work, we examine whether xLPLMs are absolutely superior to smaller-sized PLMs when fine-tuned for domain-specific MT. We use two in-domain datasets of different sizes: commercial automotive in-house data and clinical shared-task data from the ClinSpEn2022 challenge at WMT2022. We choose the popular Marian Helsinki model as the smaller-sized PLM and two massive Mega-Transformers from Meta-AI as xLPLMs. Our experimental investigation shows that 1) on the smaller in-domain commercial automotive data, the xLPLM wmt21-dense-24-wide-en-X indeed achieves much better SacreBLEU and hLEPOR scores than the smaller-sized Marian, even though its rate of score increase after fine-tuning is lower than Marian's; 2) after fine-tuning on the relatively larger, well-prepared clinical data, the xLPLM NLLB tends to lose its advantage over the smaller-sized Marian on two sub-tasks (clinical terms and ontology concepts) under the ClinSpEn-provided metrics METEOR, COMET, and ROUGE-L, and loses outright to Marian on Task-1 (clinical cases) on all official metrics, including SacreBLEU and BLEU; 3) metrics do not always agree with each other on the same tasks using the same model outputs; 4) clinic-Marian ranked No. 2 on Task-1 (via SacreBLEU/BLEU) and Task-3 (via METEOR and ROUGE) among all submissions.
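Finding 3, that metrics can disagree on the same outputs, can be illustrated with a minimal sketch. The two functions below are simplified stand-ins written for illustration only (a unigram precision and an LCS-based F1 in the spirit of ROUGE-L); they are not the official SacreBLEU, METEOR, COMET, or ROUGE-L implementations used in the challenge, and the reference and hypothesis sentences are invented examples, not data from the experiments.

```python
from collections import Counter


def unigram_precision(hyp: str, ref: str) -> float:
    """Clipped unigram precision: order-insensitive word overlap (toy metric)."""
    h, r = hyp.split(), ref.split()
    if not h:
        return 0.0
    overlap = Counter(h) & Counter(r)  # per-word minimum counts
    return sum(overlap.values()) / len(h)


def lcs_length(a: list, b: list) -> int:
    """Length of the longest common subsequence, via row-by-row DP."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def lcs_f1(hyp: str, ref: str) -> float:
    """Order-sensitive F1 over the LCS (simplified ROUGE-L-style toy metric)."""
    h, r = hyp.split(), ref.split()
    lcs = lcs_length(h, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(h), lcs / len(r)
    return 2 * precision * recall / (precision + recall)


# Invented example sentences (hypothetical, for illustration only).
reference = "the patient was given aspirin daily"
hyp_a = "daily aspirin was given the patient"   # right words, wrong order
hyp_b = "the patient was given medicine daily"  # right order, one wrong word

for name, hyp in [("A", hyp_a), ("B", hyp_b)]:
    print(name, round(unigram_precision(hyp, reference), 3),
          round(lcs_f1(hyp, reference), 3))
```

Under these toy definitions, the order-insensitive metric ranks hypothesis A above B, while the order-sensitive one ranks B above A, which is the kind of disagreement that motivates reporting several metrics side by side.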