The performance of Large Language Models (LLMs) in reasoning tasks depends heavily on prompt design, with Chain-of-Thought (CoT) and self-consistency being critical methods that enhance this ability. However, these methods do not fully exploit the answers generated by the LLM to guide subsequent responses. This paper proposes a new prompting method, named Progressive-Hint Prompting (PHP), that enables automatic multiple interactions between users and LLMs by using previously generated answers as hints to progressively guide the model toward the correct answer. PHP is orthogonal to CoT and self-consistency, making it easy to combine with state-of-the-art techniques to further improve performance. We conducted an extensive and comprehensive evaluation to demonstrate the effectiveness of the proposed method. Our experimental results on six benchmarks show that combining CoT and self-consistency with PHP significantly improves accuracy while remaining highly efficient. For instance, with text-davinci-003, we observed a 4.2% improvement on GSM8K with greedy decoding compared to Complex CoT, and a 46.17% reduction in sample paths with self-consistency. With GPT-4 and PHP, we achieve state-of-the-art performance on SVAMP (91.9%), GSM8K (95.5%), and AQuA (79.9%).
Title: Progressive-Hint Prompting Improves Reasoning in Large Language Models
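The interaction loop described in the abstract can be sketched in a few lines: ask the question, then re-ask it with all previously generated answers appended as a hint, stopping once the answer stabilizes. This is a minimal illustration, not the authors' implementation; `query_llm` is a hypothetical callable standing in for any LLM API call, and the hint phrasing follows the "(Hint: The answer is near to ...)" pattern from the paper.

```python
def progressive_hint_prompting(question, query_llm, max_rounds=10):
    """Sketch of the Progressive-Hint Prompting (PHP) loop.

    `query_llm` is a placeholder for an LLM call (e.g. an API request)
    that takes a prompt string and returns an answer string.
    """
    hints = []       # previously generated answers, reused as hints
    previous = None  # answer from the last round
    for _ in range(max_rounds):
        if hints:
            # Append all prior answers as a progressive hint.
            prompt = f"{question} (Hint: The answer is near to {', '.join(hints)})."
        else:
            prompt = question  # first round: base prompt, no hint
        answer = query_llm(prompt)
        if answer == previous:
            # Two consecutive identical answers: treat as converged.
            return answer
        previous = answer
        hints.append(answer)
    return previous
```

In practice the base prompt would itself be a CoT or Complex CoT prompt, since PHP is orthogonal to those techniques and is applied on top of them.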