The potential impact of a paper is often quantified by how many citations it will receive. However, most commonly used models may underestimate the influence of newly published papers over time and fail to encode the dynamics of the citation network in the graph. In this study, we construct hierarchical and heterogeneous graphs for target papers from an annual perspective. The constructed graphs record the annual dynamics of the target papers' scientific context information. We then propose a novel graph neural network, the Hierarchical and Heterogeneous Contrastive Graph Learning Model (H2CGL), to incorporate the heterogeneity and dynamics of the citation network. H2CGL aggregates the heterogeneous information separately for each year and prioritizes highly cited papers and the relationships among references, citations, and the target paper. It then employs a weighted GIN to capture the dynamics between the heterogeneous subgraphs over the years. Moreover, it leverages contrastive learning to make the graph representations more sensitive to potential citations. In particular, co-cited or co-citing papers of the target paper that have a large citation gap are taken as hard negative samples, while positive samples are generated by randomly dropping low-cited papers. Extensive experimental results on two scholarly datasets demonstrate that the proposed H2CGL significantly outperforms a series of baseline approaches for both previously and freshly published papers. Additional analyses highlight the significance of the proposed modules. Our code and settings have been released on GitHub (https://github.com/ECNU-Text-Computing/H2CGL).
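Summary (translated from Chinese): In this study, we construct hierarchical heterogeneous graphs for papers from an annual perspective, recording the annual dynamics of the target paper's scientific context information. To address the inability of mainstream models to encode in the graph how the influence of newly published papers changes over time, we propose a novel graph neural network, the Hierarchical and Heterogeneous Contrastive Graph Learning Model (H2CGL). The model aggregates heterogeneous information year by year, prioritizes highly cited papers and the relationships among references, citations, and the target paper, encodes subgraph features with inner-product pooling or MLP pooling, uses a weighted GIN to capture the relationships between the heterogeneous subgraphs of different years, and incorporates contrastive learning to make the graph representations more sensitive. Experiments on two scholarly datasets show that, compared with existing models, H2CGL performs well at predicting the impact of both newly published and earlier papers. The model is also highly robust to noise and is applicable to many useful downstream tasks. Our code and settings are publicly available on GitHub (https://github.com/ECNU-Text-Computing/H2CGL).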
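
The contrastive sampling idea described above (hard negatives from co-cited or co-citing papers with a large citation gap, positives from randomly dropping low-cited papers) can be illustrated with a minimal sketch. This is not the released H2CGL code: the function names `make_positive_view` and `pick_hard_negatives`, the dictionary-based graph representation, and the values of `low_cited_threshold`, `drop_prob`, and `min_gap` are all assumptions chosen only for illustration.

```python
import random


def make_positive_view(target_id, citation_counts,
                       low_cited_threshold=5, drop_prob=0.3, seed=None):
    """Return the paper ids kept after randomly dropping low-cited papers.

    Hypothetical helper: the threshold and drop probability are
    illustrative assumptions, not values taken from the paper.
    """
    rng = random.Random(seed)
    kept = []
    for paper_id, count in citation_counts.items():
        # The target paper is always kept, even if it is low-cited.
        if paper_id != target_id and count < low_cited_threshold:
            if rng.random() < drop_prob:
                continue  # drop this low-cited paper from the view
        kept.append(paper_id)
    return kept


def pick_hard_negatives(target_id, related_ids, citation_counts, min_gap=50):
    """Select co-cited or co-citing papers whose citation count differs
    from the target's by at least `min_gap` (an assumed cutoff)."""
    target_count = citation_counts[target_id]
    return [
        paper_id
        for paper_id in related_ids
        if paper_id != target_id
        and abs(citation_counts[paper_id] - target_count) >= min_gap
    ]


# Toy citation counts for a target paper and four related papers.
counts = {"target": 120, "a": 2, "b": 300, "c": 110, "d": 1}

positive_view = make_positive_view("target", counts, seed=0)
hard_negatives = pick_hard_negatives("target", ["a", "b", "c", "d"], counts)
```

In this toy setting, paper `c` is not a hard negative because its citation count is close to the target's, whereas `a`, `b`, and `d` differ by more than the assumed gap. In a real pipeline, the kept paper ids would be used to build the augmented subgraph that is fed to the graph encoder.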