FakeEdge: Alleviate Dataset Shift in Link Prediction

Link prediction is a crucial problem in graph-structured data. Following the recent success of graph neural networks (GNNs), a variety of GNN-based models have been proposed to tackle the link prediction task. Specifically, GNNs leverage the message passing paradigm to obtain node representations, which rely on link connectivity. However, in a link prediction task, links in the training set are always present while those in the testing set are not yet formed, resulting in a discrepancy in the connectivity pattern and a bias in the learned representations. This leads to a dataset shift problem that degrades model performance. In this paper, we first identify the dataset shift problem in the link prediction task and provide theoretical analyses of how existing link prediction methods are vulnerable to it. We then propose FakeEdge, a model-agnostic technique that addresses the problem by mitigating the graph topological gap between the training and testing sets. Extensive experiments demonstrate the applicability and superiority of FakeEdge on multiple datasets across various domains.
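The connectivity discrepancy described in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: the helper names (`message_pass`, `pair_embedding`), the scalar node features, and the single round of sum aggregation are all assumptions chosen for illustration. The `"plus"` and `"minus"` modes are one hypothetical way to make the topology around a target pair consistent (always adding or always removing the target edge before message passing); the abstract itself does not spell out the mechanism.

```python
def normalize(edge):
    """Store undirected edges in a canonical (small, large) order."""
    return tuple(sorted(edge))

def build_adj(edges, nodes):
    """Build an adjacency map for an undirected graph."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def message_pass(adj, feats):
    """One round of sum aggregation: h_v = x_v + sum of neighbor features."""
    return {v: feats[v] + sum(feats[n] for n in adj[v]) for v in adj}

def pair_embedding(edges, nodes, feats, u, v, mode="as_is"):
    """Embed the target pair (u, v) after one message-passing round.

    mode="as_is": use the graph unchanged (exposes the train/test gap).
    mode="minus": always remove the target edge first (hypothetical fix).
    mode="plus":  always add the target edge first (hypothetical fix).
    """
    edges = {normalize(e) for e in edges}
    target = normalize((u, v))
    if mode == "minus":
        edges.discard(target)
    elif mode == "plus":
        edges.add(target)
    h = message_pass(build_adj(edges, nodes), feats)
    return (h[u], h[v])

# Toy graph: the pair (0, 1) is a positive link in both splits.
nodes = [0, 1, 2, 3]
feats = {0: 1.0, 1: 2.0, 2: 4.0, 3: 8.0}
test_edges = {(0, 2), (1, 2), (2, 3)}   # target link not yet observed
train_edges = test_edges | {(0, 1)}     # target link present in the graph

# Without any correction, the same pair gets different embeddings.
shifted = (pair_embedding(train_edges, nodes, feats, 0, 1),
           pair_embedding(test_edges, nodes, feats, 0, 1))

# With a consistent treatment of the target edge, the embeddings agree.
minus_train = pair_embedding(train_edges, nodes, feats, 0, 1, mode="minus")
minus_test = pair_embedding(test_edges, nodes, feats, 0, 1, mode="minus")
plus_train = pair_embedding(train_edges, nodes, feats, 0, 1, mode="plus")
plus_test = pair_embedding(test_edges, nodes, feats, 0, 1, mode="plus")
```

In this toy setting, the uncorrected embeddings of the pair differ between the two splits, while either consistent treatment of the target edge yields identical embeddings, which is the kind of topological gap the abstract refers to.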

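Link prediction

Link prediction in a network refers to estimating, from known information such as the network's nodes and structure, the likelihood that a link will form between two nodes that are not yet connected. This covers both links that exist but are not yet known (existent yet unknown links) and links that will form in the future (future links). Research on this problem is significant and valuable both theoretically and in applications.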
