This paper proposes hybrid semi-Markov conditional random fields (SCRFs) for neural sequence labeling in natural language processing. Building on conventional conditional random fields (CRFs), SCRFs assign labels to segments rather than to individual words: they extract features from segments and model the transitions between them. In this paper, we improve existing SCRF methods by employing word-level and segment-level information simultaneously. First, word-level labels are utilized to derive the segment scores in SCRFs. Second, a CRF output layer and an SCRF output layer are integrated into a unified neural network and trained jointly. Experimental results on the CoNLL 2003 named entity recognition (NER) shared task show that our model achieves state-of-the-art performance when no external knowledge is used.
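The idea of deriving segment scores from word-level labels can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: a BIOES word-level tag scheme, a segment score defined as the plain sum of the word-level tag scores that the segment implies, and a semi-Markov-style dynamic program that omits segment transition scores for brevity. The names `segment_tags`, `segment_score`, and `decode` are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Tuple

WordScores = List[Dict[str, float]]  # one {tag: score} dict per word


def segment_tags(label: str, length: int) -> List[str]:
    """Word-level tags implied by a segment (BIOES scheme, an assumption)."""
    if label == "O":
        return ["O"] * length
    if length == 1:
        return [f"S-{label}"]
    return [f"B-{label}"] + [f"I-{label}"] * (length - 2) + [f"E-{label}"]


def segment_score(word_scores: WordScores, start: int, end: int, label: str) -> float:
    """Sum the word-level tag scores over words start..end (inclusive)."""
    tags = segment_tags(label, end - start + 1)
    return sum(word_scores[start + i][tag] for i, tag in enumerate(tags))


def decode(word_scores: WordScores, labels: List[str], max_len: int) -> List[Tuple[int, int, str]]:
    """Find the best segmentation by dynamic programming over segment ends.

    Simplification: no transition scores between adjacent segments.
    """
    n = len(word_scores)
    best = [float("-inf")] * (n + 1)  # best[i] = best score covering words 0..i-1
    best[0] = 0.0
    back: List[Tuple[int, str]] = [(0, "")] * (n + 1)
    for end in range(1, n + 1):
        for length in range(1, min(max_len, end) + 1):
            start = end - length
            for label in labels:
                if label == "O" and length > 1:
                    continue  # treat each non-entity word as its own segment
                score = best[start] + segment_score(word_scores, start, end - 1, label)
                if score > best[end]:
                    best[end] = score
                    back[end] = (start, label)
    # Follow back-pointers from the end of the sentence to recover segments.
    segments = []
    pos = n
    while pos > 0:
        start, label = back[pos]
        segments.append((start, pos - 1, label))
        pos = start
    return segments[::-1]
```

In a neural model, the per-word score dictionaries would come from the network's word-level output, and a full SCRF would add segment transition scores and normalize over all segmentations during training; this sketch covers only the scoring and decoding step.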