We demonstrate that, through appropriate prompting, the GPT-3 family of models can be triggered to perform the iterative behaviours necessary to execute (rather than merely write or recall) programs that involve loops, including several popular algorithms found in computer science curricula and software developer interviews. We trigger execution and description of iterations by regimenting self-attention (IRSA) in one, or a combination, of three ways: 1) using strong repetitive structure in an example of an execution path of a target program for one particular input, 2) prompting with fragments of execution paths, and 3) explicitly forbidding (skipping) self-attention to parts of the generated text. On a dynamic program execution task, IRSA leads to larger accuracy gains than replacing the model with the much more powerful GPT-4. IRSA has promising applications in education, as the prompts and responses resemble student assignments in data structures and algorithms classes. Our findings hold implications for evaluating LLMs, which typically target in-context learning: we show that prompts that may not even cover one full task example can trigger algorithmic behaviour, making it possible to solve problems previously thought hard for LLMs, such as logical puzzles. Consequently, prompt design plays an even more critical role in LLM performance than previously recognized.