This paper presents a method for estimating the hallucination rate for in-context learning (ICL) with generative AI. In ICL, a conditional generative model (CGM) is prompted with a dataset and a prediction question and asked to generate a response. One interpretation of ICL assumes that the CGM computes the posterior predictive of an unknown Bayesian model, which implicitly defines a joint distribution over observable datasets and latent mechanisms. This joint distribution factorizes into two components: the model prior over mechanisms and the model likelihood of datasets given a mechanism. With this perspective, we define a hallucination as a generated response to the prediction question with low model likelihood given the mechanism. We develop a new method that takes an ICL problem and estimates the probability that a CGM will generate a hallucination. Our method only requires generating prediction questions and responses from the CGM and evaluating its response log probability. We empirically evaluate our method using large language models for synthetic regression and natural language ICL tasks.
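The estimation procedure described above can be sketched as a simple Monte Carlo loop: draw responses from the CGM, score each under the model, and count how often the score falls below a likelihood threshold. This is a minimal illustration under assumed interfaces, not the paper's implementation; `sample_response`, `log_prob`, and the threshold `log_eps` are hypothetical stand-ins for the CGM's sampling and scoring calls and for the paper's hallucination criterion.

```python
import math
import random


def estimate_hallucination_rate(sample_response, log_prob, query,
                                n_samples=200, log_eps=-5.0):
    """Monte Carlo estimate of the probability that a generated response
    is a hallucination, i.e. has model log probability below log_eps.

    sample_response(query) -> draws one response from the CGM (assumed interface).
    log_prob(response, query) -> the CGM's log probability of that response.
    """
    hallucinations = 0
    for _ in range(n_samples):
        response = sample_response(query)           # generate a response
        if log_prob(response, query) < log_eps:     # low likelihood => hallucination
            hallucinations += 1
    return hallucinations / n_samples


# Toy stand-in for a CGM: a known categorical distribution over two responses.
probs = {"a": 0.9, "b": 0.1}
random.seed(0)
sample = lambda q: random.choices(list(probs), weights=probs.values())[0]
score = lambda y, q: math.log(probs[y])

# With threshold log(0.5), only "b" counts as a hallucination,
# so the estimate should be near its true probability, 0.1.
rate = estimate_hallucination_rate(sample, score, "query",
                                   n_samples=500, log_eps=math.log(0.5))
```

In practice the sampling and scoring calls would be made against a large language model, and the threshold would come from the paper's definition of low model likelihood given the latent mechanism rather than a fixed constant.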