Named entity recognition (NER) is a long-standing task in natural language processing. Nested entity recognition in particular has received extensive attention because nested entities are so common. Recent work adapts the well-established set-prediction paradigm from object detection to handle entity nesting. However, these approaches are limited by manually created query vectors, which fail to adapt to the rich semantic information in the context. To address this issue, this paper presents an end-to-end entity detection approach built from a proposer and a regressor. First, the proposer uses a feature pyramid network to generate high-quality entity proposals. Then, the regressor refines the proposals to produce the final predictions. The model adopts an encoder-only architecture and thereby gains semantically rich queries, precise entity localization, and ease of training. Moreover, we introduce a novel spatially modulated attention mechanism and progressive refinement for further improvement. Extensive experiments demonstrate that our model performs strongly on both flat and nested NER, achieving new state-of-the-art F1 scores of 80.74 on the GENIA dataset and 72.38 on the WeiboNER dataset.
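The two-stage idea (propose spans, then refine their boundaries) can be illustrated with a toy sketch. This is not the model from the paper: the function names `propose_spans` and `refine`, the rule that pyramid level k holds spans of width k+1, and the integer boundary offsets are all assumptions made for illustration. In the actual model both stages would be driven by learned features rather than by exhaustive enumeration and hand-supplied offsets.

```python
from typing import List, Tuple

# A span is (start, end) over token indices: start inclusive, end exclusive.
Span = Tuple[int, int]


def propose_spans(num_tokens: int, max_width: int) -> List[List[Span]]:
    """Toy 'proposer': enumerate candidate spans level by level.

    Assumption: level k (0-based) holds every span of width k + 1,
    loosely mimicking one scale per pyramid layer.
    """
    return [
        [(start, start + width) for start in range(num_tokens - width + 1)]
        for width in range(1, max_width + 1)
    ]


def refine(span: Span, offsets: Tuple[int, int], num_tokens: int) -> Span:
    """Toy 'regressor': shift both boundaries, then clamp to the sentence.

    The offsets are hand-supplied here; a trained regressor would predict them.
    """
    start = min(max(span[0] + offsets[0], 0), num_tokens - 1)
    end = min(max(span[1] + offsets[1], start + 1), num_tokens)
    return (start, end)


if __name__ == "__main__":
    levels = propose_spans(num_tokens=4, max_width=3)
    for k, spans in enumerate(levels):
        print(f"level {k}: {spans}")
    # Overlapping candidates such as (0, 3) and (1, 2) coexist in the
    # proposal set, which is what lets a span-based view represent nesting.
    print(refine((1, 3), (-1, 1), num_tokens=4))
```

Because every candidate is an independent span, an inner entity and the outer entity that contains it are simply two different proposals, which a sequence-labeling view with one tag per token cannot express directly.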