Zero-shot cross-lingual transfer is an important capability of modern NLP models and architectures for supporting low-resource languages. In this work, we study zero-shot cross-lingual transfer from English to French and German under Multi-Label Text Classification, where we train a classifier on an English training set and test it on French and German test sets. We extend the EURLEX57K dataset, an English dataset for topic classification of legal documents, with official French and German translations. We investigate the effect of two training techniques, namely Gradual Unfreezing and Language Model finetuning, on the quality of zero-shot cross-lingual transfer. We find that Language Model finetuning of the multilingual pre-trained models (M-DistilBERT, M-BERT) leads to 32.0-34.94% and 76.15-87.54% relative improvement on the French and German test sets, respectively. Gradually unfreezing the pre-trained model's layers during training results in a relative improvement of 38-45% for French and 58-70% for German. Compared to a model trained in a Joint Training scheme on the English, French, and German training sets, the zero-shot BERT-based classification model reaches 86% of the performance achieved by the jointly-trained BERT-based model.
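Below is a minimal sketch of the gradual-unfreezing idea for a multilingual BERT classifier, assuming the HuggingFace transformers and PyTorch libraries; the model name, label count, layer schedule, and helper function are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch: gradual unfreezing of a multilingual BERT classifier (assumed setup).
import torch
from transformers import AutoModelForSequenceClassification

# Assumption: mBERT with a multi-label head; the label count is illustrative.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased",
    num_labels=4271,
    problem_type="multi_label_classification",
)

# Start with the entire encoder frozen so only the classification head trains.
for param in model.bert.parameters():
    param.requires_grad = False

def unfreeze_top_layers(model, n_layers):
    """Unfreeze the top `n_layers` transformer blocks of the encoder."""
    encoder_layers = model.bert.encoder.layer
    for layer in encoder_layers[len(encoder_layers) - n_layers:]:
        for param in layer.parameters():
            param.requires_grad = True

# Example schedule (an assumption): after each epoch, unfreeze one more
# block from the top, then continue training on the English training set.
for epoch in range(3):
    unfreeze_top_layers(model, n_layers=epoch + 1)
    # ... run one training epoch on the English data here ...
```

The intuition is that the randomly initialized classification head is trained first while the pre-trained multilingual representations stay fixed, and encoder layers are then released gradually from the top down, which limits catastrophic forgetting of the cross-lingual features learned during pre-training.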