Path-Consistency with Prefix Enhancement for Efficient Inference in LLMs

To enhance the reasoning capabilities of large language models (LLMs), self-consistency has become a popular approach, combining multiple samplings with majority voting. However, current methods are computationally expensive and time-consuming because they require numerous samplings. To address this, this paper introduces path-consistency, which leverages the confidence of earlier-generated answers to identify the most promising prefix and uses that prefix to dynamically guide the generation of subsequent branches. This mitigates the errors and redundancies caused by random or less useful sampling in self-consistency and significantly accelerates inference by reducing token consumption. Our extensive empirical results demonstrate that path-consistency reduces inference latency by up to 40.5% while maintaining task accuracy across various tasks, including mathematical reasoning, commonsense reasoning, and symbolic reasoning.
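The idea can be illustrated with a minimal Python sketch. The abstract does not specify how confidence is computed or when the prefix is updated, so everything below is an assumption made for illustration: confidence is taken to be the vote share of the leading answer within a fixed window of recent branches, and the prefix grows by one reasoning step whenever that share reaches a threshold. The names `majority_vote`, `path_consistency_sketch`, `sample`, `window`, and `threshold` are hypothetical and not taken from the paper.

```python
from collections import Counter
from typing import Callable, List, Tuple

# A sampled reasoning path: (list of reasoning steps, final answer).
Path = Tuple[List[str], str]


def majority_vote(answers: List[str]) -> Tuple[str, float]:
    """Return the most frequent answer and its share of the votes."""
    counts = Counter(answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(answers)


def path_consistency_sketch(
    sample: Callable[[List[str]], Path],
    num_branches: int = 8,
    window: int = 4,
    threshold: float = 0.75,
) -> str:
    """Sample branches, periodically extending a shared prefix.

    `sample(prefix)` is an assumed stand-in for an LLM call that continues
    from the given reasoning steps and returns the full path and its answer.
    """
    prefix: List[str] = []
    paths: List[Path] = []
    for i in range(num_branches):
        paths.append(sample(prefix))
        if (i + 1) % window == 0:
            recent = paths[-window:]
            answer, confidence = majority_vote([a for _, a in recent])
            # Assumption: vote share within the window serves as confidence.
            if confidence >= threshold:
                # Reuse one more step from a path that reached the leading answer.
                donor = next(steps for steps, a in recent if a == answer)
                prefix = donor[: len(prefix) + 1]
    # The final answer is still chosen by majority voting over all branches.
    final, _ = majority_vote([a for _, a in paths])
    return final
```

In this sketch, later branches regenerate only the steps after the shared prefix, which is where the token savings described above would come from. Plain self-consistency corresponds to never updating `prefix`.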
