Pretrained multilingual text encoders based on neural Transformer architectures, such as multilingual BERT (mBERT) and XLM, have achieved strong performance on a myriad of language understanding tasks. Consequently, they have been adopted as a go-to paradigm for multilingual and cross-lingual representation learning and transfer, rendering cross-lingual word embeddings (CLWEs) effectively obsolete. However, questions remain as to the extent to which this finding generalizes 1) to unsupervised settings and 2) to ad-hoc cross-lingual IR (CLIR) tasks. Therefore, in this work we present a systematic empirical study focused on the suitability of state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks across a large number of language pairs. In contrast to supervised language understanding, our results indicate that for unsupervised document-level CLIR -- a setup with no relevance judgments for IR-specific fine-tuning -- pretrained encoders fail to significantly outperform models based on CLWEs. For sentence-level CLIR, we demonstrate that state-of-the-art performance can be achieved. However, peak performance is not attained by using the general-purpose multilingual text encoders `off-the-shelf', but rather by relying on their variants that have been further specialized for sentence understanding tasks.
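To make the unsupervised CLIR setup concrete, the following is a minimal sketch of sentence-level retrieval with an off-the-shelf multilingual sentence encoder: the query and the candidate documents are embedded into a shared multilingual space and ranked by cosine similarity, with no relevance judgments and no IR-specific fine-tuning. The model name and the sentence-transformers library are illustrative assumptions, not necessarily the exact setup used in the study.

```python
# Minimal sketch of unsupervised sentence-level CLIR with a multilingual
# sentence encoder (no relevance judgments, no IR-specific fine-tuning).
# The model name below is an illustrative assumption, not the paper's setup.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

query = "effects of climate change on agriculture"           # English query
docs = [
    "Auswirkungen des Klimawandels auf die Landwirtschaft",   # German candidates
    "Geschichte der europäischen Malerei im 19. Jahrhundert",
]

# Encode query and candidates into the shared multilingual embedding space.
q_emb = model.encode([query])   # shape: (1, dim)
d_emb = model.encode(docs)      # shape: (num_docs, dim)

# Rank candidates by cosine similarity to the query.
q_norm = q_emb / np.linalg.norm(q_emb, axis=1, keepdims=True)
d_norm = d_emb / np.linalg.norm(d_emb, axis=1, keepdims=True)
scores = (d_norm @ q_norm.T).ravel()

for rank, idx in enumerate(np.argsort(-scores), start=1):
    print(f"{rank}. score={scores[idx]:.3f}  {docs[idx]}")
```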