To better handle long-tail cases in the sequence labeling (SL) task, in this work we introduce graph neural network sequence labeling (GNN-SL), which augments the vanilla SL model output with similar tagging examples retrieved from the whole training set. Since not all retrieved tagging examples benefit the model prediction, we construct a heterogeneous graph and leverage graph neural networks (GNNs) to transfer information between the retrieved tagging examples and the input word sequence. The augmented node, which aggregates information from its neighbors, is used for prediction. This strategy enables the model to directly acquire similar tagging examples and improves the overall quality of its predictions. We conduct a variety of experiments on three typical sequence labeling tasks, Named Entity Recognition (NER), Part-of-Speech (POS) tagging, and Chinese Word Segmentation (CWS), to show the strong performance of GNN-SL. Notably, GNN-SL achieves SOTA results of 96.9 (+0.2) on PKU, 98.3 (+0.4) on CITYU, 98.5 (+0.2) on MSR, and 96.9 (+0.2) on AS for the CWS task, and results comparable to SOTA performance on the NER and POS datasets.
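The retrieve-then-aggregate idea above can be illustrated with a minimal sketch: retrieve the most similar tagging examples for an input token, then mix their representations into the token's own before predicting. This is a toy illustration under assumed simplifications (a tiny NumPy "datastore", cosine-similarity retrieval, one round of similarity-weighted averaging with a fixed 0.5 mixing weight), not the paper's actual heterogeneous-graph implementation.

```python
import numpy as np

def retrieve_neighbors(query, keys, k):
    # Cosine similarity between the query token representation and all
    # datastore keys; return the indices and similarities of the top-k.
    sims = keys @ query / (np.linalg.norm(keys, axis=1) * np.linalg.norm(query) + 1e-8)
    idx = np.argsort(-sims)[:k]
    return idx, sims[idx]

def aggregate(query, neighbor_reprs, sims):
    # One round of similarity-weighted message passing: the augmented node
    # mixes its own representation with a softmax-weighted neighbor average.
    weights = np.exp(sims) / np.exp(sims).sum()
    message = weights @ neighbor_reprs
    return 0.5 * query + 0.5 * message  # fixed mixing weight, for illustration

# Toy datastore: token representations from the training set and their labels.
keys = np.array([[1.0, 0.0],
                 [0.9, 0.1],
                 [0.0, 1.0]])
labels = np.array([0, 0, 1])

# An input token whose representation resembles the first two entries.
query = np.array([0.95, 0.05])
idx, sims = retrieve_neighbors(query, keys, k=2)
augmented = aggregate(query, keys[idx], sims)

# Predict with the label of the retrieved neighbor closest to the augmented node.
pred = labels[idx][np.argmin(np.linalg.norm(keys[idx] - augmented, axis=1))]
```

Here both retrieved neighbors carry label 0, so the augmented node is pulled toward that region and the prediction is 0; in the full model this aggregation is done by GNN layers over a heterogeneous graph rather than a single weighted average.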