Extending Named Entity Recognition (NER) models to new PII entities in noisy spoken-language data is a common need. We find that jointly fine-tuning a BERT model on standard semantic entities (PER, LOC, ORG) and new pattern-based PII entities (EMAIL, PHONE) causes minimal degradation on the original classes. We investigate this "peaceful coexistence," hypothesizing that the model relies on independent mechanisms for semantic and morphological features. Using an incremental learning setup as a diagnostic tool, we measure semantic drift and report two key findings. First, the LOC (location) entity is uniquely vulnerable because its representation overlaps with that of the new PII: it shares pattern-like features (e.g., postal codes). Second, we identify a "reverse O-tag representation drift": the model, initially trained to map PII patterns to 'O', blocks new learning. This is resolved only by unfreezing the 'O' tag's classifier, which allows the background class to adapt and "release" these patterns. This work provides a mechanistic diagnosis of NER model adaptation, highlighting feature independence, representation overlap, and 'O' tag plasticity. This work is based on data gathered by https://www.papernest.com
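To make the phrase "unfreezing the 'O' tag's classifier" concrete, the following is a minimal toy sketch of a per-label linear classification head in which individual label rows can be frozen or trained. It is not the authors' setup: the two-dimensional feature vector, the initial weights, the learning rate, and all function names are assumptions made purely for illustration, and a real experiment would use the token-classification head of a BERT model. The sketch only shows the mechanics of row-level freezing and does not reproduce the reported blocking effect.

```python
import math

LABELS = ["O", "PER", "LOC", "ORG", "EMAIL", "PHONE"]
NEW_LABELS = {"EMAIL", "PHONE"}

# Hypothetical 2-d token features: [semantic cue, pattern cue].
# An email-like token is assumed to carry only the pattern cue.
EMAIL_TOKEN = [0.0, 1.0]


def initial_head():
    """Toy head after the first stage: 'O' has absorbed the pattern cue."""
    head = {label: [0.0, 0.0] for label in LABELS}
    head["O"] = [0.0, 3.0]  # assumed value, chosen only for illustration
    return head


def probabilities(head, features):
    """Softmax over the per-label linear scores."""
    scores = [sum(w * x for w, x in zip(head[label], features)) for label in LABELS]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return {label: e / total for label, e in zip(LABELS, exps)}


def sgd_step(head, features, gold, trainable, lr=0.5):
    """One cross-entropy SGD step; rows of labels not in `trainable` stay frozen."""
    probs = probabilities(head, features)
    updated = {}
    for label in LABELS:
        row = list(head[label])
        if label in trainable:
            # Gradient of the loss w.r.t. this label's logit: p - 1[label == gold].
            error = probs[label] - (1.0 if label == gold else 0.0)
            row = [w - lr * error * x for w, x in zip(row, features)]
        updated[label] = row
    return updated


def train(trainable, steps=5):
    """Repeatedly fit the toy email token, updating only the trainable rows."""
    head = initial_head()
    for _ in range(steps):
        head = sgd_step(head, EMAIL_TOKEN, "EMAIL", trainable)
    return head


if __name__ == "__main__":
    frozen = train(NEW_LABELS)            # 'O' row frozen
    unfrozen = train(NEW_LABELS | {"O"})  # 'O' row allowed to adapt
    print("O row, frozen:  ", frozen["O"])
    print("O row, unfrozen:", unfrozen["O"])
    print("P(EMAIL), frozen:  ", probabilities(frozen, EMAIL_TOKEN)["EMAIL"])
    print("P(EMAIL), unfrozen:", probabilities(unfrozen, EMAIL_TOKEN)["EMAIL"])
```

In this toy, a frozen 'O' row keeps its pattern weight unchanged, so the new EMAIL row must outgrow it on its own. When the 'O' row is trainable, the same update also lowers the 'O' score for the pattern, which is one way to read the abstract's description of the background class "releasing" these patterns.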