In-context learning (ICL) enables large language models (LLMs) to perform new tasks using only a few demonstrations. However, in Named Entity Recognition (NER), existing ICL methods typically rely on task-agnostic semantic similarity for demonstration retrieval, which often yields less relevant examples and leads to inferior results. We introduce DEER, a training-free ICL approach that enables LLMs to make more informed entity predictions by leveraging label-grounded statistics. DEER computes token-level statistics from training labels to identify the tokens most informative for entity recognition, enabling entity-focused demonstration retrieval. It further uses these statistics to detect error-prone tokens and refine them through a targeted reflection step. Evaluated on five NER datasets across four LLMs, DEER consistently outperforms existing ICL methods and achieves performance comparable to supervised fine-tuning. Further analyses show that DEER improves example retrieval, remains effective on both seen and unseen entities, and exhibits strong robustness in low-resource settings.
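As a rough illustration of the kind of label-grounded, token-level statistic the abstract describes, the sketch below computes, for each training token, the fraction of its occurrences that fall inside an entity span, then ranks candidate demonstrations by their overlap with the query's entity-informative tokens. This is a minimal sketch under assumed conventions (BIO tags, a 0.5 informativeness threshold, and the hypothetical helpers entity_token_stats and score_demonstration); it is not DEER's actual statistic, retrieval procedure, or reflection step.

```python
from collections import Counter

def entity_token_stats(train_data):
    # For each token type, estimate how often it appears inside an entity
    # span in the training labels. This is one simple label-grounded
    # statistic; the paper's exact formulation may differ.
    inside, total = Counter(), Counter()
    for tokens, bio_tags in train_data:
        for tok, tag in zip(tokens, bio_tags):
            total[tok.lower()] += 1
            if tag != "O":  # B-* or I-*: token lies inside an entity
                inside[tok.lower()] += 1
    return {t: inside[t] / total[t] for t in total}

def score_demonstration(query_tokens, demo_tokens, stats, threshold=0.5):
    # Score a candidate demonstration by its overlap with the query's
    # entity-informative tokens, rather than by generic semantic similarity.
    informative = {t.lower() for t in query_tokens
                   if stats.get(t.lower(), 0.0) > threshold}
    return len(informative & {t.lower() for t in demo_tokens})

# Usage: rank training examples for a query sentence and take the top-k
# as in-context demonstrations (toy data, for illustration only).
train_data = [
    (["Barack", "Obama", "visited", "Paris"],
     ["B-PER", "I-PER", "O", "B-LOC"]),
    (["The", "meeting", "was", "long"],
     ["O", "O", "O", "O"]),
]
stats = entity_token_stats(train_data)
query = ["Obama", "spoke", "in", "Berlin"]
ranked = sorted(train_data,
                key=lambda ex: score_demonstration(query, ex[0], stats),
                reverse=True)
print([tokens for tokens, _ in ranked[:1]])  # the Obama/Paris example ranks first
```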