A BERT-based Neural Ranking Model (NRM) can be either a cross-encoder or a bi-encoder. Of the two, the bi-encoder is far more efficient because all documents can be pre-encoded before query time. Although queries and documents are encoded independently, existing bi-encoder NRMs are Siamese models in which a single language model encodes both queries and documents in the same way. In this work, we present two approaches for improving the performance of BERT-based bi-encoders. The first is to replace full fine-tuning with lightweight fine-tuning; we examine adapter-based and prompt-based lightweight fine-tuning methods as well as a hybrid of the two. The second is to develop semi-Siamese models in which queries and documents are handled with a limited amount of difference: two lightweight fine-tuning modules are learned, one for queries and one for documents, while the main BERT language model is kept common to both. We provide extensive experimental results for monoBERT, TwinBERT, and ColBERT, evaluating three performance metrics over the Robust04, ClueWeb09b, and MS-MARCO datasets. The results confirm that both lightweight fine-tuning and semi-Siamese modeling considerably improve BERT-based bi-encoders; in fact, lightweight fine-tuning helps cross-encoders as well.
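To make the semi-Siamese idea concrete, the following is a minimal sketch (not the authors' code) of a bi-encoder in which one frozen BERT is shared while two small adapter modules, one for queries and one for documents, hold all trainable parameters. The class and variable names, the bottleneck size, and the choice of applying the adapter only to the [CLS] output are illustrative assumptions; adapter-based tuning in practice inserts adapters inside each transformer layer.

```python
# Minimal sketch of a semi-Siamese bi-encoder: a shared, frozen BERT plus
# two lightweight adapters (query vs. document). Names are illustrative.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast


class Adapter(nn.Module):
    """Residual bottleneck adapter applied to a pooled representation
    (a simplification of per-layer adapter tuning)."""

    def __init__(self, hidden=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(hidden, bottleneck)
        self.up = nn.Linear(bottleneck, hidden)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))


class SemiSiameseBiEncoder(nn.Module):
    def __init__(self, model_name="bert-base-uncased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)
        for p in self.bert.parameters():      # keep the language model common and frozen
            p.requires_grad = False
        self.query_adapter = Adapter()        # lightweight module for queries
        self.doc_adapter = Adapter()          # lightweight module for documents

    def encode(self, texts, tokenizer, adapter):
        batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
        cls = self.bert(**batch).last_hidden_state[:, 0]   # [CLS] representation
        return adapter(cls)

    def score(self, queries, docs, tokenizer):
        q = self.encode(queries, tokenizer, self.query_adapter)
        d = self.encode(docs, tokenizer, self.doc_adapter)
        return (q * d).sum(dim=-1)            # dot-product relevance score


tok = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = SemiSiameseBiEncoder()
print(model.score(["what is a bi-encoder"],
                  ["A bi-encoder encodes queries and documents separately."], tok))
```

Because only the two adapters are trainable, the query and document encoders differ by a limited, learnable amount while the underlying BERT remains identical, which is the essence of the semi-Siamese design described above.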