Compared to other language tasks, applying pre-trained language models (PLMs) to search ranking often requires more nuanced designs and training signals. In this paper, we identify and study two mismatches between pre-training and ranking fine-tuning: the training schema gap, arising from differences in training objectives and model architectures, and the task knowledge gap, arising from the discrepancy between the knowledge needed for ranking and that learned during pre-training. To mitigate these gaps, we propose the Pre-trained, Prompt-learned and Pre-finetuned Neural Ranker (P^3 Ranker). P^3 Ranker leverages prompt-based learning to convert the ranking task into a pre-training-like schema and uses pre-finetuning to initialize the model on intermediate supervised tasks. Experiments on MS MARCO and Robust04 show the superior performance of P^3 Ranker in few-shot ranking. Analyses reveal that P^3 Ranker better adapts to the ranking task through prompt-based learning and retrieves the ranking-oriented knowledge gained during pre-finetuning, resulting in data-efficient PLM adaptation. Our code is available at https://github.com/NEUIR/P3Ranker.
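To make the prompt-based reformulation concrete, the sketch below (not taken from the paper) shows how a query-document pair can be cast as a cloze-style masked-language-modeling input, so relevance scoring reuses the pre-training objective. The template wording, the "yes"/"no" verbalizer tokens, and the bert-base-uncased backbone are illustrative assumptions rather than the exact choices used by P^3 Ranker.

```python
# Minimal sketch: prompt-based relevance scoring as masked-token prediction.
# Template, verbalizer words, and backbone are assumptions for illustration.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def relevance_score(query: str, document: str) -> float:
    # Wrap the (query, document) pair in a cloze-style prompt so ranking
    # looks like the MLM pre-training task instead of a new classification head.
    prompt = (
        f"Query: {query} Document: {document} "
        f"Is the document relevant? {tokenizer.mask_token}."
    )
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    # Locate the [MASK] position whose predicted token serves as the label.
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
    yes_id = tokenizer.convert_tokens_to_ids("yes")
    no_id = tokenizer.convert_tokens_to_ids("no")
    probs = logits[0, mask_pos, :].softmax(dim=-1)[0]
    # Score = probability mass on the "yes" verbalizer relative to "yes" + "no".
    return (probs[yes_id] / (probs[yes_id] + probs[no_id])).item()

# Example usage: higher scores indicate higher predicted relevance.
print(relevance_score(
    "what is prompt learning",
    "Prompt learning reformulates downstream tasks as cloze-style questions.",
))
```

In this framing, few-shot fine-tuning only adjusts the model toward predicting the correct verbalizer token at the mask position, which is one way to narrow the training schema gap described above.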