In this work, we induce character-level noise in various forms when fine-tuning BERT to enable zero-shot cross-lingual transfer to unseen dialects and languages. We fine-tune BERT on three sentence-level classification tasks and evaluate our approach on an assortment of unseen dialects and languages. We find that character-level noise can be an extremely effective agent of cross-lingual transfer under certain conditions, while not as helpful in others. Specifically, we explore these differences in terms of the nature of the task and the relationship between the source and target languages, finding that the introduction of character-level noise during fine-tuning is particularly helpful when the task draws on surface-level cues and the source-target cross-lingual pair has relatively high lexical overlap with shorter (i.e., less meaningful) unseen tokens on average.
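To make the idea of character-level noise concrete, the sketch below shows one generic way such noise could be injected into fine-tuning text. The specific operations (delete, insert, substitute, swap), the noise rate `p`, and the restriction to non-space characters are illustrative assumptions, not the exact recipe used in this work.

```python
import random
import string

def add_char_noise(text, p=0.1, rng=None):
    """Apply random character-level noise to a string.

    Each non-space character is perturbed with probability p by one of
    four generic operations: delete, insert, substitute, or swap with
    the next character. This is a hypothetical illustration; the noise
    types and rates in the paper may differ.
    """
    rng = rng or random.Random()
    chars = list(text)
    out = []
    i = 0
    while i < len(chars):
        c = chars[i]
        if c != " " and rng.random() < p:
            op = rng.choice(["delete", "insert", "substitute", "swap"])
            if op == "delete":
                pass  # drop the character entirely
            elif op == "insert":
                out.append(c)
                out.append(rng.choice(string.ascii_lowercase))
            elif op == "substitute":
                out.append(rng.choice(string.ascii_lowercase))
            elif op == "swap" and i + 1 < len(chars) and chars[i + 1] != " ":
                out.append(chars[i + 1])  # transpose with the next character
                out.append(c)
                i += 1
            else:
                out.append(c)  # swap not possible at this position
        else:
            out.append(c)
        i += 1
    return "".join(out)
```

Applied to each training sentence before tokenization, such noise fragments words into smaller subword pieces, which may encourage the model to rely less on exact lexical matches.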