Compared to other language tasks, applying pre-trained language models (PLMs) to search ranking often requires more nuanced adaptation and stronger training signals. In this paper, we identify and study two mismatches between pre-training and ranking fine-tuning: the training schema gap, arising from differences in training objectives and model architectures, and the task knowledge gap, arising from the discrepancy between the knowledge needed for ranking and the knowledge learned during pre-training. To mitigate these gaps, we propose the Pre-trained, Prompt-learned and Pre-finetuned Neural Ranker (P$^3$ Ranker). P$^3$ Ranker leverages prompt-based learning to convert the ranking task into a pre-training-like schema and uses pre-finetuning to initialize the model on intermediate supervised tasks. Experiments on MS MARCO and Robust04 show the superior performance of P$^3$ Ranker in few-shot ranking. Our analyses reveal that P$^3$ Ranker better adapts to the ranking task through prompt-based learning and retrieves the necessary ranking-oriented knowledge acquired during pre-finetuning, resulting in data-efficient PLM adaptation. Our code is available at \url{https://github.com/NEUIR/P3Ranker}.
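To make the prompt-based conversion concrete, the sketch below illustrates how a query-document pair can be wrapped in a cloze-style template and scored with a masked-language-model head, so that ranking reuses the pre-training objective. This is a minimal illustration rather than our released implementation: the template wording, the ``yes''/``no'' verbalizer, and the \texttt{bert-base-uncased} backbone are assumptions made for exposition only (see \url{https://github.com/NEUIR/P3Ranker} for the actual code).
\begin{verbatim}
# Illustrative sketch only (not the released P3 Ranker code):
# prompt-based ranking with a cloze template and an MLM verbalizer.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL_NAME = "bert-base-uncased"  # hypothetical backbone chosen for illustration
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)
model.eval()

# Assumed verbalizer: "yes" marks a relevant document, "no" an irrelevant one.
YES_ID = tokenizer.convert_tokens_to_ids("yes")
NO_ID = tokenizer.convert_tokens_to_ids("no")

def prompt_rank_score(query: str, document: str) -> float:
    """Score a query-document pair by filling a cloze prompt with the MLM head."""
    # Cloze-style template that casts ranking as a pre-training-like MLM task.
    # NB: very long documents should be truncated before building the template
    # so the [MASK] token is not cut off.
    text = f"Query: {query} Document: {document} Relevant: {tokenizer.mask_token}"
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)
    # Read the verbalizer-token logits at the [MASK] position.
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0].item()
    pair = torch.stack([logits[0, mask_pos, YES_ID], logits[0, mask_pos, NO_ID]])
    return torch.softmax(pair, dim=0)[0].item()  # probability mass on "relevant"

# Usage: rank candidate documents for one query by their prompt scores.
query = "what causes tides"
docs = ["Tides are caused by the gravitational pull of the moon and sun.",
        "The stock market closed higher today."]
ranked = sorted(docs, key=lambda d: prompt_rank_score(query, d), reverse=True)
print(ranked)
\end{verbatim}
Because the relevance decision is read out of the same masked-token prediction used during pre-training, few-shot fine-tuning only needs to adjust the verbalizer-token distribution rather than train a new classification head from scratch, which is the intuition behind closing the training schema gap.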