Retrieval-Augmented Generation (RAG) effectively enhances Large Language Models (LLMs) by incorporating retrieved external knowledge into the generation process. Reasoning models improve LLM performance on multi-hop QA tasks, which require integrating and reasoning over multiple pieces of evidence across different documents to answer a complex question. However, these models often incur substantial computational costs, including increased token consumption and inference latency. To better understand and mitigate this trade-off, we conduct a comprehensive study of the reasoning strategies that reasoning models employ in RAG multi-hop QA. Our findings reveal that reasoning models adopt structured strategies to integrate retrieved and internal knowledge, primarily following two modes: Context-Grounded Reasoning, which relies directly on retrieved content, and Knowledge-Reconciled Reasoning, which resolves conflicts or fills gaps using internal knowledge. Building on these findings, we propose a novel Lightweight Rerank Reasoning Strategy Framework for RAG (LiR$^3$AG) that enables non-reasoning models to transfer reasoning strategies by restructuring retrieved evidence into coherent reasoning chains. LiR$^3$AG reduces output-token overhead by an average of 98% and inference time by 58.6%, while improving the F1 score of an 8B non-reasoning model by 6.2% to 22.5%, allowing it to surpass a 32B reasoning model in RAG and offering a practical and efficient path forward for RAG systems.