Self-supervised Learning for Heterogeneous Graph via Structure Information based on Metapath

Graph neural networks (GNNs) are the dominant paradigm for modeling graph-structured data by learning universal node representations. Training GNNs conventionally depends on large amounts of labeled data, which is costly and time-consuming to obtain; in some settings, labels are unavailable or impractical to collect. Self-supervised representation learning, which generates labels from the graph-structured data itself, is a promising way to address this problem. Self-supervised learning on heterogeneous graphs is more challenging than on homogeneous graphs, and has been studied less. In this paper, we propose a SElf-supervised learning method for heterogeneous graphs via Structure Information based on Metapath (SESIM). The proposed model constructs pretext tasks that predict the jump number between nodes in each metapath, in order to improve the representation ability of the primary task. To predict the jump number, SESIM uses the data itself to generate labels, avoiding time-consuming manual labeling. Moreover, predicting the jump number in each metapath effectively exploits graph structure information, which is an essential property of the relationships between nodes. SESIM therefore deepens the model's understanding of graph structure. Finally, we train the primary task and the pretext tasks jointly, and use meta-learning to balance the contribution of the pretext tasks to the primary task. Empirical results validate the performance of SESIM and demonstrate that it can improve the representation ability of traditional neural networks on link prediction and node classification tasks.
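
To make the pretext task concrete, the following is a minimal sketch of how jump-number labels could be generated from the data itself. It is not the authors' implementation. It assumes that the "jump number" between two nodes is the shortest hop count in the graph induced by one metapath, and it uses a hypothetical author-paper toy graph with the metapath author-paper-author (A-P-A). The function names `apa_adjacency` and `jump_labels`, the cap `max_jumps`, and the extra "unreachable" class are all illustrative assumptions.

```python
from collections import defaultdict, deque
from itertools import combinations


def apa_adjacency(author_paper_edges):
    """Build the A-P-A metapath graph: two authors are adjacent
    if they share at least one paper (illustrative assumption)."""
    authors_of = defaultdict(set)
    for author, paper in author_paper_edges:
        authors_of[paper].add(author)

    adj = defaultdict(set)
    for authors in authors_of.values():
        for a, b in combinations(sorted(authors), 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def jump_labels(adj, nodes, max_jumps=3):
    """Label every unordered node pair with its hop count in the
    metapath graph, found by breadth-first search. Pairs farther
    apart than max_jumps (or disconnected) get class max_jumps + 1."""
    labels = {}
    for src in nodes:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            if dist[u] == max_jumps:
                continue  # do not expand beyond the cap
            for v in adj.get(u, ()):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for dst in nodes:
            if src < dst:
                labels[(src, dst)] = dist.get(dst, max_jumps + 1)
    return labels


# Hypothetical toy data: (author, paper) edges.
edges = [("a1", "p1"), ("a2", "p1"), ("a2", "p2"), ("a3", "p2"), ("a4", "p3")]
authors = ["a1", "a2", "a3", "a4"]

adj = apa_adjacency(edges)
labels = jump_labels(adj, authors, max_jumps=3)
print(labels)
```

Under these assumptions, the resulting labels would serve as classification targets for one pretext task per metapath, with no manual annotation required. How such pretext losses are weighted against the primary task through meta-learning is not specified in the abstract and is not sketched here.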
