Neural named entity recognition (NER) models easily become over-confident, which degrades both performance and calibration. Inspired by label smoothing and motivated by the ambiguity of boundary annotation in NER engineering, we propose boundary smoothing as a regularization technique for span-based neural NER models. It re-assigns entity probabilities from annotated spans to the surrounding ones. Built on a simple but strong baseline, our model achieves results better than or competitive with previous state-of-the-art systems on eight well-known NER benchmarks. Further empirical analysis suggests that boundary smoothing effectively mitigates over-confidence, improves model calibration, and brings flatter neural minima and more smoothed loss landscapes.
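The core idea of re-assigning entity probability from the annotated span to its surrounding spans can be illustrated with a minimal sketch. This assumes a span-classification setup where candidate spans are indexed by (start, end) pairs; the function name, the smoothing weight `eps`, and the smoothing radius `D` (neighboring spans within Manhattan distance `D` of the annotated boundaries) are illustrative choices, not the paper's exact implementation.

```python
import numpy as np

def boundary_smoothed_targets(span, seq_len, eps=0.1, D=1):
    """Build a (seq_len, seq_len) target matrix over candidate spans
    for one annotated entity span, with boundary smoothing applied.

    span: (start, end) inclusive token indices of the annotated entity.
    eps:  total probability mass re-assigned to surrounding spans.
    D:    smoothing radius in Manhattan distance over (start, end).
    """
    s, e = span
    target = np.zeros((seq_len, seq_len))

    # Collect valid surrounding spans within Manhattan distance <= D.
    neighbors = []
    for ds in range(-D, D + 1):
        for de in range(-D, D + 1):
            dist = abs(ds) + abs(de)
            if dist == 0 or dist > D:
                continue
            ns, ne = s + ds, e + de
            if 0 <= ns <= ne < seq_len:
                neighbors.append((ns, ne))

    if neighbors:
        # (1 - eps) stays on the annotated span; eps is split evenly
        # among its surrounding spans.
        target[s, e] = 1.0 - eps
        share = eps / len(neighbors)
        for ns, ne in neighbors:
            target[ns, ne] = share
    else:
        target[s, e] = 1.0
    return target
```

For a sequence of length 5 with an entity at span (1, 2), `eps=0.1`, and `D=1`, the annotated span keeps probability 0.9 and each of its four valid neighbors, such as (0, 2) and (1, 3), receives 0.025; the targets still sum to 1, so they can be used directly with a cross-entropy-style loss over candidate spans.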