Recent studies have proposed different methods to improve multilingual word representations in contextualized settings, including techniques that align source and target embedding spaces. For contextualized embeddings, alignment becomes more complex as we additionally take context into consideration. In this work, we propose using Optimal Transport (OT) as an alignment objective during fine-tuning to further improve multilingual contextualized representations for downstream cross-lingual transfer. This approach does not require word-alignment pairs prior to fine-tuning, which may lead to sub-optimal matching, and instead learns the word alignments within context in an unsupervised manner. It also allows for different types of mappings thanks to the soft matching between source and target sentences. We benchmark our proposed method on two tasks (XNLI and XQuAD) and achieve improvements over baselines as well as competitive results compared to similar recent work.
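To make the idea concrete, the sketch below shows one plausible form such an OT alignment objective could take during fine-tuning, assuming an entropy-regularized (Sinkhorn) solver over cosine distances between contextualized token embeddings of a parallel sentence pair. The function names, the cosine cost, the uniform marginals, and the Sinkhorn solver itself are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def sinkhorn(cost, a, b, reg=0.1, n_iters=50):
    """Entropy-regularized OT via Sinkhorn iterations.

    cost: (m, n) pairwise cost matrix between source/target tokens.
    a: (m,) source marginal, b: (n,) target marginal (here uniform).
    Returns the (m, n) soft transport plan.
    """
    K = torch.exp(-cost / reg)  # Gibbs kernel
    u = torch.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.t() @ u).clamp_min(1e-9)
        u = a / (K @ v).clamp_min(1e-9)
    return u.unsqueeze(1) * K * v.unsqueeze(0)

def ot_alignment_loss(src_emb, tgt_emb, reg=0.1):
    """OT alignment loss between contextualized token embeddings.

    src_emb: (m, d) token embeddings of the source sentence.
    tgt_emb: (n, d) token embeddings of the target sentence.
    The transport plan plays the role of soft, unsupervised word
    alignments; the resulting transport cost can be added to the
    task loss during fine-tuning.
    """
    src = torch.nn.functional.normalize(src_emb, dim=-1)
    tgt = torch.nn.functional.normalize(tgt_emb, dim=-1)
    cost = 1.0 - src @ tgt.t()  # (m, n) cosine distances
    m, n = cost.shape
    a = torch.full((m,), 1.0 / m, device=cost.device)
    b = torch.full((n,), 1.0 / n, device=cost.device)
    plan = sinkhorn(cost, a, b, reg=reg)  # soft word alignments
    return (plan * cost).sum()
```

Because the plan is a dense soft matching rather than a hard one-to-one assignment, it can express one-to-many and many-to-one mappings between source and target tokens, which is the property the abstract highlights.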