Large Language Models (LLMs) excel at question answering (QA) but often generate hallucinations, including factual errors and fabricated content. Detecting hallucinations from internal uncertainty signals is attractive due to its scalability and independence from external resources. Existing methods typically aim to capture a single type of uncertainty accurately, overlooking the complementarity among different sources, particularly between token-level probability uncertainty and the uncertainty conveyed by internal semantic representations, two signals that offer complementary views of model reliability. We present \textbf{HaluNet}, a lightweight, trainable neural framework that integrates multi-granular token-level uncertainties by combining semantic embeddings with probabilistic confidence and distributional uncertainty. Its multi-branch architecture adaptively fuses what the model knows with the uncertainty expressed in its outputs, enabling efficient one-pass hallucination detection. Experiments on SQuAD, TriviaQA, and Natural Questions show that HaluNet delivers strong detection performance and favorable computational efficiency, with or without access to context, highlighting its potential for real-time hallucination detection in LLM-based QA systems.
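To make the fusion idea concrete, the following is a minimal, hypothetical PyTorch sketch of a multi-branch module that combines a token-level semantic embedding with scalar uncertainty features (token probability and predictive entropy) through a learned gate. All names, dimensions, and the specific gating scheme are illustrative assumptions, not HaluNet's actual implementation.

```python
# Hypothetical sketch of the multi-branch fusion idea; NOT the authors' code.
# Dimensions, feature choices, and the gating scheme are illustrative assumptions.
import torch
import torch.nn as nn

class MultiBranchFusion(nn.Module):
    """Fuses a token's semantic embedding with scalar uncertainty features
    (token probability, predictive entropy) into one hallucination score."""

    def __init__(self, hidden_dim: int = 4096, proj_dim: int = 128):
        super().__init__()
        # Branch 1: compress the LLM's internal semantic representation.
        self.semantic_branch = nn.Sequential(nn.Linear(hidden_dim, proj_dim), nn.ReLU())
        # Branch 2: embed scalar uncertainty signals (probability, entropy).
        self.uncertainty_branch = nn.Sequential(nn.Linear(2, proj_dim), nn.ReLU())
        # Learned gate that adaptively weights the two views per example.
        self.gate = nn.Sequential(nn.Linear(2 * proj_dim, proj_dim), nn.Sigmoid())
        self.classifier = nn.Linear(proj_dim, 1)

    def forward(self, hidden: torch.Tensor, token_prob: torch.Tensor,
                entropy: torch.Tensor) -> torch.Tensor:
        sem = self.semantic_branch(hidden)                        # (B, proj_dim)
        unc = self.uncertainty_branch(
            torch.stack([token_prob, entropy], dim=-1))           # (B, proj_dim)
        g = self.gate(torch.cat([sem, unc], dim=-1))              # adaptive weights
        fused = g * sem + (1.0 - g) * unc                         # gated fusion
        return self.classifier(fused).squeeze(-1)                 # hallucination logit

# Usage: score a batch of 8 pooled answer representations from a 4096-d LLM.
model = MultiBranchFusion()
logits = model(torch.randn(8, 4096), torch.rand(8), torch.rand(8))
scores = torch.sigmoid(logits)  # higher score -> more likely hallucinated
```

Because the detector reads hidden states and token-level statistics already produced during generation, a module of this kind can run in a single forward pass, which is what enables the one-pass, real-time setting described above.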