Since most scientific literature data are unlabeled, unsupervised graph-based semantic representation learning is crucial. We therefore propose an unsupervised semantic representation learning method for scientific literature based on a graph attention mechanism and maximum mutual information (GAMMI). By introducing a graph attention mechanism, node features are aggregated as a weighted sum over neighboring nodes, with the weights depending entirely on the node features themselves. Because the weights are derived from the features of nearby nodes, a different weight can be assigned to each node in the graph, so correlations between vertex features are better integrated into the model. In addition, an unsupervised graph contrastive learning strategy is proposed to handle unlabeled data and to scale to large graphs. By contrasting the mutual information between positive and negative local node representations in the latent space and the global graph representation, the graph neural network captures both local and global information. Experimental results demonstrate competitive performance on various node classification benchmarks, achieving good results and sometimes even surpassing supervised learning.
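The feature-dependent neighbor weighting described above can be sketched as follows. This is a minimal illustration of graph-attention-style aggregation, not the paper's actual implementation: the parameter vectors `a_src` and `a_dst`, the LeakyReLU slope, and the dense adjacency representation are all assumptions made for clarity.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    """LeakyReLU nonlinearity commonly used on attention scores."""
    return np.where(x > 0, x, slope * x)

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_aggregate(H, adj, a_src, a_dst):
    """Weighted-sum aggregation where weights depend only on node features.

    H     : (N, F) node feature matrix
    adj   : (N, N) adjacency matrix (assumed to include self-loops)
    a_src : (F,) attention parameters for the center node (hypothetical name)
    a_dst : (F,) attention parameters for the neighbor node (hypothetical name)
    """
    N, F = H.shape
    out = np.zeros_like(H)
    for i in range(N):
        nbrs = np.where(adj[i] > 0)[0]
        # Raw attention score for each neighbor j is a function of the
        # features of i and j only -- no fixed structural weights.
        scores = np.array([H[i] @ a_src + H[j] @ a_dst for j in nbrs])
        att = softmax(leaky_relu(scores))  # normalized per-neighbor weights
        # Weighted sum of neighbor features using the learned attention.
        out[i] = (att[:, None] * H[nbrs]).sum(axis=0)
    return out
```

In a full model these parameter vectors would be learned, the aggregation would be followed by a nonlinearity, and multiple attention heads would typically be concatenated; the sketch only shows why the aggregation weights adapt to node features rather than to the graph structure alone.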