Pretrained Transformers for Text Ranking: BERT and Beyond

The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in response to a query. Although the most common formulation of text ranking is search, instances of the task can also be found in many natural language processing applications. This survey provides an overview of text ranking with neural network architectures known as transformers, of which BERT is the best-known example. The combination of transformers and self-supervised pretraining has been responsible for a paradigm shift in natural language processing (NLP), information retrieval (IR), and beyond. In this survey, we provide a synthesis of existing work as a single point of entry for practitioners who wish to gain a better understanding of how to apply transformers to text ranking problems and researchers who wish to pursue work in this area. We cover a wide range of modern techniques, grouped into two high-level categories: transformer models that perform reranking in multi-stage architectures and dense retrieval techniques that perform ranking directly. There are two themes that pervade our survey: techniques for handling long documents, beyond typical sentence-by-sentence processing in NLP, and techniques for addressing the tradeoff between effectiveness (i.e., result quality) and efficiency (e.g., query latency, model and index size). Although transformer architectures and pretraining techniques are recent innovations, many aspects of how they are applied to text ranking are relatively well understood and represent mature techniques. However, there remain many open research questions, and thus in addition to laying out the foundations of pretrained transformers for text ranking, this survey also attempts to prognosticate where the field is heading.
