Query-by-document (QBD) retrieval is an information retrieval task in which a seed document acts as the query and the goal is to retrieve related documents; it is particularly common in professional search tasks. In this work, we improve the retrieval effectiveness of the BERT re-ranker by extending its fine-tuning step to better exploit the context of queries. To this end, we add a document-level representation learning objective alongside the ranking objective when fine-tuning the BERT re-ranker. Our experiments on two QBD retrieval benchmarks show that the proposed multi-task optimization significantly improves ranking effectiveness without changing the BERT re-ranker or using additional training samples. In future work, the generalizability of our approach to other retrieval tasks should be investigated further.
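The multi-task idea described in the abstract, a ranking objective combined with a document-level representation learning objective, can be illustrated with a minimal sketch. The abstract does not specify the loss functions, so everything here is an assumption for illustration: a pointwise binary cross-entropy stands in for the ranking objective, a cosine-based contrastive loss stands in for the representation objective, and the `weight` parameter and all function names are hypothetical. In practice the score and vectors would come from a BERT encoder; here they are plain Python numbers and lists.

```python
import math


def ranking_loss(score, label):
    """Assumed ranking objective: binary cross-entropy on a raw relevance logit.

    Uses the numerically stable form max(s, 0) - s * y + log(1 + exp(-|s|)).
    """
    return max(score, 0.0) - score * label + math.log1p(math.exp(-abs(score)))


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def representation_loss(query_vec, doc_vec, label):
    """Assumed representation objective: cosine-based contrastive loss.

    Relevant pairs (label 1) are pulled together; non-relevant pairs
    (label 0) are penalized only when their similarity is positive.
    """
    sim = cosine(query_vec, doc_vec)
    return 1.0 - sim if label == 1 else max(0.0, sim)


def multi_task_loss(score, query_vec, doc_vec, label, weight=0.5):
    """Sum of the two objectives; `weight` is a hypothetical trade-off."""
    return ranking_loss(score, label) + weight * representation_loss(
        query_vec, doc_vec, label
    )


if __name__ == "__main__":
    # Toy relevant pair: neutral logit, identical document-level vectors.
    print(multi_task_loss(0.0, [1.0, 0.0], [1.0, 0.0], label=1))
```

Setting `weight=0.0` recovers the ranking-only objective, which makes it easy to compare single-task and multi-task fine-tuning in this toy setting without touching the model or the training samples.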