Large language models (LLMs), such as ChatGPT, are able to generate human-like, fluent responses for many downstream tasks, e.g., task-oriented dialog and question answering. However, applying LLMs to real-world, mission-critical applications remains challenging, mainly due to their tendency to generate hallucinations and their inability to use external knowledge. This paper proposes an LLM-Augmenter system, which augments a black-box LLM with a set of plug-and-play modules. Our system makes the LLM generate responses grounded in consolidated external knowledge, e.g., knowledge stored in task-specific databases. It also iteratively revises LLM prompts to improve model responses using feedback generated by utility functions, e.g., the factuality score of an LLM-generated response. The effectiveness of LLM-Augmenter is empirically validated on two types of mission-critical scenarios: task-oriented dialog and open-domain question answering. LLM-Augmenter significantly reduces ChatGPT's hallucinations without sacrificing the fluency and informativeness of its responses. We make the source code and models publicly available.
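The feedback loop described in the abstract can be illustrated with a minimal sketch. This is not the released LLM-Augmenter code: every name here (`overlap_utility`, `augmented_generate`, `fake_retrieve`, `fake_llm`), the token-overlap score, the threshold, and the prompt wording are assumptions made for illustration, and the LLM and knowledge source are replaced by stubs so the example runs offline.

```python
from typing import Callable, List


def overlap_utility(response: str, evidence: List[str]) -> float:
    """Toy stand-in for a factuality score (an assumption, not the paper's metric):
    the fraction of response tokens that also appear in the evidence."""
    evidence_tokens = set(" ".join(evidence).lower().split())
    tokens = response.lower().split()
    if not tokens:
        return 0.0
    return sum(t in evidence_tokens for t in tokens) / len(tokens)


def augmented_generate(
    query: str,
    llm: Callable[[str], str],
    retrieve: Callable[[str], List[str]],
    utility: Callable[[str, List[str]], float] = overlap_utility,
    threshold: float = 0.8,
    max_rounds: int = 3,
) -> str:
    """Ground the prompt in retrieved evidence, then revise the prompt with
    utility feedback until the score passes the threshold or rounds run out."""
    evidence = retrieve(query)
    prompt = "Evidence:\n" + "\n".join(evidence) + f"\nQuestion: {query}\nAnswer:"
    response = ""
    for _ in range(max_rounds):
        response = llm(prompt)  # the LLM is treated as a black box
        score = utility(response, evidence)
        if score >= threshold:
            break
        # Append the feedback to the prompt and ask for a revised answer.
        prompt += (
            f"\nPrevious answer: {response}\n"
            f"Feedback: only {score:.2f} of the answer is supported by the "
            "evidence; revise it to use only the evidence.\nAnswer:"
        )
    return response


# Hypothetical stubs standing in for a task database and a black-box LLM.
def fake_retrieve(query: str) -> List[str]:
    return ["the hotel has free parking"]


def fake_llm(prompt: str) -> str:
    # Returns an ungrounded answer first, a grounded one after feedback.
    if "Feedback:" in prompt:
        return "the hotel has free parking"
    return "the hotel offers a rooftop pool"


if __name__ == "__main__":
    print(augmented_generate("Does the hotel have parking?", fake_llm, fake_retrieve))
```

With these stubs, the first candidate scores below the threshold, so the prompt is extended with feedback and the second candidate is returned. In a real deployment the stubs would be replaced by an actual LLM call and a retrieval module, and the utility could be any scoring function with the same signature.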