Recently, researchers have explored graph neural network (GNN) techniques for text classification, since GNNs handle complex structures well and preserve global information. However, previous GNN-based methods face two practical problems: a fixed corpus-level graph structure, which does not support online testing, and high memory consumption. To tackle these problems, we propose a new GNN-based model that builds a graph for each input text with globally shared parameters, instead of a single graph for the whole corpus. This removes the dependence of an individual text on the entire corpus, which supports online testing while still preserving global information. Besides, we build graphs with much smaller windows in the text, which not only extracts more local features but also significantly reduces the number of edges and, with it, memory consumption. Experiments show that our model outperforms existing models on several text classification datasets while consuming less memory.
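To make the per-text graph construction concrete, below is a minimal Python sketch of the idea described above: each document becomes its own small graph whose nodes are its unique tokens and whose edges connect tokens co-occurring within a narrow sliding window. The function name, `window_size` parameter, and data layout are illustrative assumptions, not the paper's actual implementation, which additionally shares node and edge parameters globally across all texts.

```python
def build_text_graph(tokens, window_size=2):
    """Illustrative sketch: build a graph for a single text.

    Nodes are the unique tokens of this text only (no corpus-level
    graph), and directed edges link tokens that co-occur within
    `window_size` positions of each other. A small window keeps the
    edge count, and hence memory use, low.
    """
    nodes = sorted(set(tokens))
    index = {tok: i for i, tok in enumerate(nodes)}
    edges = set()
    for center, tok in enumerate(tokens):
        lo = max(0, center - window_size)
        hi = min(len(tokens), center + window_size + 1)
        for neighbor in tokens[lo:center] + tokens[center + 1:hi]:
            edges.add((index[tok], index[neighbor]))
    return nodes, sorted(edges)

# A tiny example document: the graph depends only on this text,
# so a new, unseen text can be classified online.
nodes, edges = build_text_graph("the cat sat on the mat".split())
print(nodes)  # unique-token node list
print(edges)  # co-occurrence edges within the window
```

Because the graph for a document is built from that document alone, adding a new test text never requires rebuilding a corpus-wide structure; global information enters only through the shared parameters attached to nodes and edges.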