Named entity recognition (NER) remains challenging when entity mentions can be discontinuous. Existing methods break the recognition process into several sequential steps. In training, they predict conditioned on the gold intermediate results, while at inference they rely on the model output of previous steps, which introduces exposure bias. To address this problem, we first construct a segment graph for each sentence, in which each node denotes a segment (a continuous entity on its own, or a part of discontinuous entities), and an edge links two nodes that belong to the same entity. The nodes and edges can each be generated in a single stage with a grid tagging scheme and learned jointly using a novel architecture named Mac. Discontinuous NER can then be reformulated as a non-parametric process of discovering maximal cliques in the graph and concatenating the spans in each clique. Experiments on three benchmarks show that our method outperforms the state-of-the-art (SOTA) results, with up to 3.5 percentage points of improvement in F1, and achieves a 5x speedup over the SOTA model.
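To make the decoding step concrete, the following is a minimal sketch of how entity mentions could be recovered as maximal cliques of a segment graph, assuming the tagging model has already produced the segment spans and the "same entity" edges. The `(start, end)` span format, the `decode_entities` function name, and the use of `networkx` are illustrative assumptions, not part of the paper's released implementation.

```python
# Minimal sketch: clique-based decoding over a predicted segment graph.
# Assumptions (not from the paper): spans are token-level (start, end) pairs
# with inclusive ends, and maximal cliques are enumerated with networkx.
from typing import List, Tuple
import networkx as nx

Span = Tuple[int, int]  # (start, end), end inclusive


def decode_entities(segments: List[Span],
                    edges: List[Tuple[Span, Span]]) -> List[List[Span]]:
    """Recover entity mentions as maximal cliques of the segment graph."""
    graph = nx.Graph()
    graph.add_nodes_from(segments)        # each node is one predicted segment
    graph.add_edges_from(edges)           # edge = "belongs to the same entity"
    entities = []
    for clique in nx.find_cliques(graph):  # maximal cliques (Bron-Kerbosch)
        # Concatenating the spans of one clique, in textual order, yields one
        # entity mention, which may be discontinuous. An isolated node forms a
        # singleton clique, i.e. a continuous entity on its own.
        entities.append(sorted(clique))
    return entities


# Toy example for "muscle pain and fatigue": the mentions "muscle pain" and
# "muscle fatigue" share the segment "muscle" (token 0).
segments = [(0, 0), (1, 1), (3, 3)]            # "muscle", "pain", "fatigue"
edges = [((0, 0), (1, 1)), ((0, 0), (3, 3))]   # same-entity predictions
print(decode_entities(segments, edges))
# -> [[(0, 0), (1, 1)], [(0, 0), (3, 3)]]  (clique order may vary)
```

Since the decoding is plain maximal clique enumeration, it introduces no learned parameters, which is what the abstract refers to as a non-parametric process.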