Word enhancement has recently become popular for Chinese Named Entity Recognition (NER) because it reduces segmentation errors and enriches the semantic and boundary information of Chinese words. However, after integrating lexical information, these methods tend to ignore the structural information of Chinese characters. Chinese characters evolved from pictographs, and their structure often carries additional information about their meaning. This paper presents a novel Multi-metadata Embedding based Cross-Transformer (MECT) that improves the performance of Chinese NER by fusing the structural information of Chinese characters. Specifically, we use multi-metadata embedding in a two-stream Transformer to integrate Chinese character features with radical-level embeddings. With the structural characteristics of Chinese characters, MECT can better capture their semantic information for NER. Experimental results on several well-known benchmark datasets demonstrate the merits and superiority of the proposed MECT method.\footnote{The source code of the proposed method is publicly available at https://github.com/CoderMusou/MECT4CNER.}
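To make the two-stream idea concrete, the following is a minimal sketch of generic cross-attention between a character-level stream and a radical-level stream, in which each stream queries the other and the two results are concatenated per position. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy 2-dimensional feature vectors, and the concatenation-based fusion are all hypothetical, and the exact attention formulation used in MECT may differ (see the linked repository for the real code).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention where the queries come from one
    stream and the keys/values come from the other stream."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        weights = softmax(scores)
        out.append(
            [
                sum(w * v[j] for w, v in zip(weights, values))
                for j in range(len(values[0]))
            ]
        )
    return out


# Toy inputs (made-up numbers): three characters, 2-dim features per stream.
char_stream = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]      # hypothetical character features
radical_stream = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]   # hypothetical radical features

# Each stream attends to the other one.
char_with_radical = cross_attention(char_stream, radical_stream, radical_stream)
radical_with_char = cross_attention(radical_stream, char_stream, char_stream)

# Assumed fusion step: concatenate the two outputs position by position.
fused = [a + b for a, b in zip(char_with_radical, radical_with_char)]
```

In this sketch `fused` holds one 4-dimensional vector per character, which a downstream tagging layer could consume. Learned projection matrices, multiple heads, and positional information are omitted for brevity.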