Many recent perturbation studies have reported counterintuitive results on what does and does not matter when performing Natural Language Understanding (NLU) tasks in English. Coding properties, such as word order, can often be removed through shuffling without affecting downstream performance. Such insights may be used to direct future research into English NLP models. Because many improvements in multilingual settings consist of wholesale adaptation of English approaches, it is important to verify whether those findings replicate in multilingual settings. In this work, we replicate a study on the importance of local structure, and the relative unimportance of global structure, in a multilingual setting. We find that the phenomenon observed in English broadly carries over to more than 120 languages, with a few caveats.