On Listwise Reranking for Corpus Feedback

Rerankers improve retrieval performance by capturing interactions between documents. At one extreme, graph-aware adaptive retrieval (GAR) represents an information-rich regime that requires a pre-computed document similarity graph during reranking. However, such graphs are often unavailable, and even when available they incur quadratic memory costs, so graph-free rerankers instead rely on large language model (LLM) calls to achieve competitive performance. We introduce L2G, a novel framework that implicitly induces document graphs from listwise reranker logs. By converting reranker signals into a graph structure, L2G enables scalable graph-based retrieval without the overhead of explicit graph computation. Results on TREC-DL and a BEIR subset show that L2G matches the effectiveness of oracle-based graph methods while incurring zero additional LLM calls.
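The abstract does not say how L2G turns reranker logs into a graph. Purely as an illustration of the general idea, the sketch below assumes that each log entry is one ranked list of document ids returned by a listwise reranker, and that documents ranked close together in the same list are likely to be related. The function name `induce_graph`, the `top_k` parameter, and the 1 / rank-distance weighting are hypothetical choices made for this example, not the paper's method.

```python
from collections import defaultdict
from itertools import combinations


def induce_graph(reranker_logs, top_k=2):
    """Build a document neighbour graph from logged listwise rankings.

    reranker_logs: list of ranked doc-id lists (best first), one per
    reranker call. Assumption (not from the paper): two documents that
    appear near each other in a ranking are related, so each co-occurrence
    adds 1 / rank-distance to the weight of the edge between them.
    """
    weights = defaultdict(float)
    for ranking in reranker_logs:
        for (i, a), (j, b) in combinations(enumerate(ranking), 2):
            if a == b:
                continue  # ignore accidental duplicates in a list
            key = (a, b) if a < b else (b, a)  # undirected edge
            weights[key] += 1.0 / (j - i)

    neighbours = defaultdict(list)
    for (a, b), w in weights.items():
        neighbours[a].append((b, w))
        neighbours[b].append((a, w))

    # Keep only the top_k strongest neighbours per document, which keeps
    # memory linear in the number of documents seen in the logs.
    return {
        doc: [n for n, _ in sorted(adj, key=lambda x: (-x[1], x[0]))[:top_k]]
        for doc, adj in neighbours.items()
    }


# Two logged rankings from earlier reranker calls (made-up ids).
logs = [["d1", "d2", "d3"], ["d2", "d3", "d4"]]
graph = induce_graph(logs, top_k=2)
```

Because the graph is derived only from rankings that already exist, building it requires no further reranker calls, which is the property the abstract emphasises. Capping each node at `top_k` neighbours is one simple way to avoid the quadratic memory cost of a full similarity graph.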

