Text classification depends on effective language representation and on extracting the information that matters, and both have been studied extensively. In recent years, graph neural networks (GNNs) have been applied to text classification with promising results. However, existing models mainly feed words directly into the GNN as graph nodes, ignoring the semantic structure present at different levels of each sample. To address this issue, we propose a hierarchical graph neural network (HieGNN) that extracts information at the word level, sentence level, and document level. On several benchmark datasets, HieGNN achieves results better than or comparable to several baseline methods, which suggests that the model obtains more useful information for classification from the samples.
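To make the three levels concrete, the following is a minimal sketch of hierarchical aggregation over a document, not the HieGNN architecture itself. The abstract does not specify how the graphs are built or how nodes are aggregated, so everything here is an assumption made for illustration: the sliding-window edges (`chain_edges`), the single round of mean aggregation with self-loops (`propagate`), the mean readout between levels, and the hypothetical `embed` lookup table. Sentences are assumed to be non-empty.

```python
def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def chain_edges(n, window=1):
    """Assumed graph construction: link nodes at most `window` positions apart."""
    return [(i, j) for i in range(n) for j in range(i + 1, min(i + window + 1, n))]


def propagate(features, edges):
    """One round of mean aggregation over each node and its neighbours."""
    neighbours = {i: {i} for i in range(len(features))}  # self-loops
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return [mean([features[j] for j in sorted(neighbours[i])])
            for i in range(len(features))]


def encode_document(sentences, embed, window=1):
    """sentences: list of token lists; embed: hypothetical token -> vector table."""
    sentence_vecs = []
    for tokens in sentences:
        # Word level: message passing inside each sentence.
        word_feats = propagate([embed[t] for t in tokens],
                               chain_edges(len(tokens), window))
        # Readout: pool word nodes into one sentence node.
        sentence_vecs.append(mean(word_feats))
    # Sentence level: message passing between neighbouring sentences.
    sentence_vecs = propagate(sentence_vecs, chain_edges(len(sentence_vecs)))
    # Document level: pool sentence nodes into one document vector.
    return mean(sentence_vecs)


# Toy input with made-up 2-dimensional embeddings.
embed = {"graphs": [1.0, 0.0], "help": [0.0, 1.0], "text": [1.0, 1.0]}
doc_vec = encode_document([["graphs", "help"], ["text"]], embed)
print(doc_vec)
```

In a trained model the document vector would be passed to a classifier and the mean aggregation replaced by learned layers; the sketch only shows how information can flow from words to sentences to the document.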