ToMMeR -- Efficient Entity Mention Detection from Large Language Models

Identifying which text spans refer to entities (mention detection) is both foundational for information extraction and a known performance bottleneck. We introduce ToMMeR, a lightweight model (<300K parameters) that probes mention detection capabilities in early LLM layers. Across 13 NER benchmarks, ToMMeR achieves 93% recall zero-shot. With an LLM as a judge, its precision exceeds 90%, showing that ToMMeR rarely produces spurious predictions despite its high recall. Cross-model analysis reveals that diverse architectures (14M to 15B parameters) converge on similar mention boundaries (DICE >75%), confirming that mention detection emerges naturally from language modeling. When extended with span classification heads, ToMMeR achieves near-state-of-the-art NER performance (80-87% F1 on standard benchmarks). Our work provides evidence that structured entity representations exist in early transformer layers and can be efficiently recovered with minimal parameters.
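To make the agreement and recall figures concrete, the following minimal sketch computes a Dice coefficient between the mention sets predicted by two models, and the recall of one set against gold annotations. It assumes mentions are compared as exact-match `(start, end)` token spans. The abstract does not state the matching granularity, so this is an assumption. The names `Span`, `dice`, and `recall`, and the toy data, are hypothetical.

```python
from typing import Set, Tuple

# Assumption: a mention is a (start, end) token span with an exclusive end,
# and two mentions agree only if both boundaries match exactly.
Span = Tuple[int, int]


def dice(a: Set[Span], b: Set[Span]) -> float:
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) between two span sets."""
    if not a and not b:
        return 1.0  # two empty predictions agree trivially
    return 2 * len(a & b) / (len(a) + len(b))


def recall(pred: Set[Span], gold: Set[Span]) -> float:
    """Fraction of gold mentions recovered by the predictions."""
    if not gold:
        return 1.0
    return len(pred & gold) / len(gold)


# Toy data (hypothetical): mentions predicted by two models on one sentence.
model_a = {(0, 2), (5, 6), (9, 12)}
model_b = {(0, 2), (5, 6), (10, 12)}
gold = {(0, 2), (9, 12)}

print(f"Dice(A, B) = {dice(model_a, model_b):.3f}")
print(f"Recall(A)  = {recall(model_a, gold):.3f}")
print(f"Recall(B)  = {recall(model_b, gold):.3f}")
```

Exact-match comparison is the strictest choice. A token-level or partial-overlap variant would give higher agreement scores on the same predictions.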
