Large LMs such as GPT-3 are powerful, but can make mistakes that are obvious to humans. For example, GPT-3 would mistakenly interpret "What word is similar to good?" to mean a homophone, while the user intended a synonym. Our goal is to effectively correct such errors via user interactions with the system, but without retraining, which would be prohibitively costly. We pair GPT-3 with a growing memory of recorded cases where the model misunderstood the user's intent, along with user feedback for clarification. Such a memory allows our system to produce enhanced prompts for any new query, based on the user feedback for error correction on similar cases in the past. On four tasks (two lexical tasks, two advanced ethical reasoning tasks), we show how a (simulated) user can interactively teach a deployed GPT-3, substantially increasing its accuracy on queries with different kinds of misunderstandings by GPT-3. Our approach is a step towards low-cost utility enhancement for very large pre-trained LMs. Code, data, and instructions to implement MEMPROMPT for a new task are available at https://www.memprompt.com/.
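To make the mechanism concrete, the sketch below illustrates the memory-and-retrieval loop described above; it is not the paper's implementation. The `FeedbackMemory` class, the string-similarity retrieval via `difflib`, and the similarity threshold are all illustrative assumptions.

```python
# Minimal sketch of memory-assisted prompt editing (MemPrompt-style).
# Assumptions: retrieval uses simple string similarity (difflib) rather than
# the paper's actual retriever, and the memory is an in-process list.
from difflib import SequenceMatcher


class FeedbackMemory:
    """Growing memory of (question, user clarification) pairs."""

    def __init__(self):
        self.entries = []  # list of (question, feedback) tuples

    def add(self, question, feedback):
        self.entries.append((question, feedback))

    def lookup(self, question, threshold=0.6):
        """Return the clarification recorded for the most similar past question, if any."""
        best, best_score = None, threshold
        for past_q, feedback in self.entries:
            score = SequenceMatcher(None, question.lower(), past_q.lower()).ratio()
            if score >= best_score:
                best, best_score = feedback, score
        return best


def build_prompt(question, memory):
    """Prepend a retrieved clarification (if any) to the query before sending it to the LM."""
    clarification = memory.lookup(question)
    if clarification is not None:
        return f"{question} [clarification: {clarification}]"
    return question


# Usage: the user corrects one misunderstanding once; similar future queries
# then carry that clarification in the prompt, without retraining the model.
memory = FeedbackMemory()
memory.add(
    "What word is similar to good?",
    "'similar to' means similar in meaning (a synonym), not similar in sound",
)
print(build_prompt("What word is similar to happy?", memory))
```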