Named Entity Recognition (NER) models capable of Continual Learning (CL) are practically valuable in domains where entity types continuously increase (e.g., personal assistants). Meanwhile, the learning paradigm of NER is advancing toward new patterns such as span-based methods, whose potential for CL has not been fully explored. In this paper, we propose SpanKL, a simple yet effective Span-based model with Knowledge distillation (KD) to preserve memories and multi-Label prediction to prevent conflicts in CL-NER. Unlike prior sequence labeling approaches, the inherently independent modeling at the span and entity level, combined with the designed coherent optimization, promotes learning at each incremental step and mitigates forgetting. Experiments on synthetic CL datasets derived from OntoNotes and Few-NERD show that SpanKL significantly outperforms the previous SoTA in many aspects and obtains the smallest gap between CL and the upper bound, revealing its high practical value.
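To make the two ingredients named above concrete, the following is a minimal sketch (not the authors' released code) of span-level multi-label prediction via independent per-type sigmoids, together with a knowledge-distillation term that keeps the new model's predictions on previously learned entity types close to those of the frozen previous-step model. The tensor shapes, span scorer, and loss weighting are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpanMultiLabelScorer(nn.Module):
    """Scores every (start, end) span independently for each entity type."""

    def __init__(self, hidden_dim: int, num_types: int):
        super().__init__()
        self.start_proj = nn.Linear(hidden_dim, hidden_dim)
        self.end_proj = nn.Linear(hidden_dim, hidden_dim)
        # One output unit per entity type -> multi-label (sigmoid), not softmax.
        self.type_scorer = nn.Linear(2 * hidden_dim, num_types)

    def forward(self, token_states: torch.Tensor) -> torch.Tensor:
        # token_states: (batch, seq_len, hidden_dim) from any encoder.
        b, n, h = token_states.shape
        starts = self.start_proj(token_states)          # (b, n, h)
        ends = self.end_proj(token_states)              # (b, n, h)
        # Pair every start position with every end position: (b, n, n, 2h).
        pair = torch.cat(
            [starts.unsqueeze(2).expand(b, n, n, h),
             ends.unsqueeze(1).expand(b, n, n, h)],
            dim=-1,
        )
        return self.type_scorer(pair)                   # (b, n, n, num_types) logits


def cl_step_loss(new_logits, gold_labels, old_logits=None, num_old_types=0, kd_weight=1.0):
    """BCE on the current step's new types plus KD on earlier steps' types.

    new_logits:  (b, n, n, T_total) span logits from the current model
    gold_labels: (b, n, n, T_new) binary labels for the new entity types only
    old_logits:  (b, n, n, T_old) span logits from the frozen previous-step model, or None
    """
    new_part = new_logits[..., num_old_types:]
    ce = F.binary_cross_entropy_with_logits(new_part, gold_labels)
    if old_logits is None:
        return ce
    # Distill old-type span probabilities from the frozen teacher (soft targets).
    kd = F.binary_cross_entropy_with_logits(
        new_logits[..., :num_old_types], torch.sigmoid(old_logits)
    )
    return ce + kd_weight * kd
```

Because each (span, type) decision is an independent sigmoid, adding new entity types at a later step does not conflict with earlier ones, and the KD term regularizes only the old-type slice of the output.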