Large language models (LLMs) have revolutionized natural language processing (NLP), particularly through Retrieval-Augmented Generation (RAG), which enhances LLM capabilities by integrating external knowledge. However, traditional RAG systems face critical limitations, including disrupted contextual integrity due to text chunking, and over-reliance on semantic similarity for retrieval. To address these issues, we propose CausalRAG, a novel framework that incorporates causal graphs into the retrieval process. By constructing and tracing causal relationships, CausalRAG preserves contextual continuity and improves retrieval precision, leading to more accurate and interpretable responses. We evaluate CausalRAG against regular RAG and graph-based RAG approaches, demonstrating its superiority across several metrics. Our findings suggest that grounding retrieval in causal reasoning provides a promising approach to knowledge-intensive tasks.