Generative Large Language Models (LLMs) such as GPT-3 are capable of generating highly fluent responses to a wide variety of user prompts. However, LLMs are known to hallucinate facts and make non-factual statements which can undermine trust in their output. Existing fact-checking approaches either require access to the token-level output probability distribution (which may not be available for systems such as ChatGPT) or external databases that are interfaced via separate, often complex, modules. In this work, we propose "SelfCheckGPT", a simple sampling-based approach that can be used to fact-check black-box models in a zero-resource fashion, i.e. without an external database. SelfCheckGPT leverages the simple idea that if an LLM has knowledge of a given concept, sampled responses are likely to be similar and contain consistent facts, whereas for hallucinated facts, stochastically sampled responses are likely to diverge and contradict one another. We investigate this approach by using GPT-3 to generate passages about individuals from the WikiBio dataset, and manually annotate the factuality of the generated passages. We demonstrate that SelfCheckGPT can: i) detect non-factual and factual sentences; and ii) rank passages in terms of factuality. We compare our approach to several existing baselines and show that in sentence hallucination detection, our approach has AUC-PR scores comparable to grey-box methods, while SelfCheckGPT is best at passage factuality assessment.
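To make the sampling-based consistency idea concrete, the sketch below scores each sentence of a main response by how well it is supported by additional stochastically sampled responses. This is an illustrative approximation, not the paper's implementation: simple lexical overlap stands in for a proper semantic similarity or entailment model, and all function and variable names are hypothetical.

```python
# Minimal sketch of sampling-based self-consistency scoring.
# Illustrative only: lexical overlap is a stand-in for the semantic
# similarity / entailment scoring a real implementation would use.
import re
from typing import List


def sentence_split(passage: str) -> List[str]:
    # Naive sentence splitter; a real system would use a proper tokenizer.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", passage) if s.strip()]


def overlap_score(sentence: str, sample: str) -> float:
    # Fraction of the sentence's word types that also appear in a sampled passage.
    sent_tokens = set(re.findall(r"\w+", sentence.lower()))
    sample_tokens = set(re.findall(r"\w+", sample.lower()))
    if not sent_tokens:
        return 0.0
    return len(sent_tokens & sample_tokens) / len(sent_tokens)


def selfcheck_scores(response: str, samples: List[str]) -> List[float]:
    # Higher score = less supported by the sampled responses = more likely hallucinated.
    scores = []
    for sentence in sentence_split(response):
        support = sum(overlap_score(sentence, s) for s in samples) / max(len(samples), 1)
        scores.append(1.0 - support)
    return scores


if __name__ == "__main__":
    main_response = "John Smith was born in 1970. He won the Nobel Prize in 1999."
    sampled = [
        "John Smith was born in 1970 and worked as a teacher.",
        "Born in 1970, John Smith spent his career teaching.",
    ]
    for sent, score in zip(sentence_split(main_response),
                           selfcheck_scores(main_response, sampled)):
        print(f"{score:.2f}  {sent}")
```

In this toy example the sentence about the Nobel Prize receives a higher (less supported) score because it is not corroborated by either sample, mirroring the intuition that hallucinated facts tend not to recur consistently across stochastic samples.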