The performance of Large Language Models (LLMs) in reasoning tasks depends heavily on prompt design, with Chain-of-Thought (CoT) and self-consistency being critical methods that enhance this ability. However, these methods do not fully exploit the answers generated by the LLM to guide subsequent responses. This paper proposes a new prompting method, named Progressive-Hint Prompting (PHP), that enables automatic multiple interactions between users and LLMs by using previously generated answers as hints to progressively guide the model toward the correct answer. PHP is orthogonal to CoT and self-consistency, making it easy to combine with state-of-the-art techniques to further improve performance. We conducted an extensive and comprehensive evaluation to demonstrate the effectiveness of the proposed method. Our experimental results on six benchmarks show that combining CoT and self-consistency with PHP significantly improves accuracy while remaining highly efficient. For instance, with text-davinci-003, we observed a 4.2% improvement on GSM8K with greedy decoding compared to Complex CoT, and a 46.17% reduction in sample paths with self-consistency. With GPT-4 and PHP, we achieve state-of-the-art performance on SVAMP (91.9%), GSM8K (95.5%), and AQuA (79.9%).
Title: Progressive-Hint Prompting Improves Reasoning in Large Language Models
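The interaction loop described in the abstract can be sketched in a few lines: ask the question, then re-ask it with all previously generated answers appended as a hint, stopping once the answer stabilizes. This is a minimal illustration, not the authors' implementation; `query_llm` is a hypothetical callable standing in for any LLM API call, and the hint phrasing follows the "(Hint: The answer is near to ...)" pattern from the paper.

```python
def progressive_hint_prompting(question, query_llm, max_rounds=10):
    """Sketch of the Progressive-Hint Prompting (PHP) loop.

    `query_llm` is a placeholder for an LLM call (e.g. an API request)
    that takes a prompt string and returns an answer string.
    """
    hints = []       # previously generated answers, reused as hints
    previous = None  # answer from the last round
    for _ in range(max_rounds):
        if hints:
            # Append all prior answers as a progressive hint.
            prompt = f"{question} (Hint: The answer is near to {', '.join(hints)})."
        else:
            prompt = question  # first round: base prompt, no hint
        answer = query_llm(prompt)
        if answer == previous:
            # Two consecutive identical answers: treat as converged.
            return answer
        previous = answer
        hints.append(answer)
    return previous
```

In practice the base prompt would itself be a CoT or Complex CoT prompt, since PHP is orthogonal to those techniques and is applied on top of them.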