Recently, text classification models based on graph neural networks (GNNs) have attracted increasing attention. Most of these models adopt a similar network paradigm: node embeddings are initialized with pre-trained representations and passed through two layers of graph convolution. In this work, we propose TextRGNN, an improved GNN structure that introduces residual connections to deepen the graph convolution network. Our structure obtains a wider node receptive field and effectively suppresses the over-smoothing of node features. In addition, we integrate a probabilistic language model into the initialization of graph node embeddings, so that non-graph semantic information can be better extracted. Experimental results show that our model is general and efficient: it significantly improves classification accuracy at both the corpus level and the text level, and achieves SOTA performance on a wide range of text classification datasets.
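The abstract's central architectural idea is adding residual (skip) connections to graph convolution layers so the network can be stacked deeper without over-smoothing node features. The sketch below is a minimal, hypothetical illustration of that idea in PyTorch; the class name ResidualGCNLayer, the dense normalized adjacency matrix adj, and all dimensions are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn


class ResidualGCNLayer(nn.Module):
    """One graph convolution layer with a residual connection.

    A minimal sketch of the idea in the abstract, not the TextRGNN layer
    itself. `adj` is assumed to be a normalized adjacency matrix (dense,
    for simplicity).
    """

    def __init__(self, dim: int):
        super().__init__()
        self.linear = nn.Linear(dim, dim)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # Standard graph convolution: aggregate neighbor features, then
        # apply a learned linear transform and nonlinearity.
        h = self.act(self.linear(adj @ x))
        # Residual connection: adding the input back lets many layers be
        # stacked (widening each node's receptive field) while keeping
        # node features from collapsing toward one another.
        return x + h


# Usage sketch: stacking several residual layers deepens the network.
if __name__ == "__main__":
    n_nodes, dim = 5, 16
    x = torch.randn(n_nodes, dim)   # stand-in for pre-trained node embeddings
    adj = torch.eye(n_nodes)        # placeholder normalized adjacency
    layers = nn.ModuleList(ResidualGCNLayer(dim) for _ in range(4))
    for layer in layers:
        x = layer(x, adj)
    print(x.shape)  # torch.Size([5, 16])
```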