Graph self-supervised learning has gained increasing attention due to its capacity to learn expressive node representations. Many pretext tasks, or loss functions, have been designed from distinct perspectives. However, we observe that different pretext tasks affect downstream tasks differently across datasets, which suggests that searching over pretext tasks is crucial for graph self-supervised learning. Unlike existing works that focus on designing a single pretext task, this work aims to investigate how to automatically leverage multiple pretext tasks effectively. Nevertheless, evaluating representations derived from multiple pretext tasks without direct access to ground-truth labels makes this problem challenging. To address this obstacle, we make use of a key principle of many real-world graphs, i.e., homophily, or the principle that ``like attracts like,'' as guidance for effectively searching over various self-supervised pretext tasks. We provide theoretical understanding and empirical evidence to justify the flexibility of homophily in this search task. We then propose the AutoSSL framework, which can automatically search over combinations of various self-supervised tasks. Evaluating the framework on 7 real-world datasets, our experiments show that AutoSSL can significantly boost performance on downstream tasks, including node clustering and node classification, compared with training under individual tasks. The code will be released at https://github.com/ChandlerBang/AutoSSL.
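As a rough illustration of the idea described above, the following Python sketch shows how one might score a candidate combination of pretext-task losses by the homophily of pseudo-labels derived from the learned embeddings. The helper `train_encoder`, the KMeans clustering step, and the edge-homophily measure are assumptions made here for illustration; they are not the paper's actual implementation.

```python
# Hypothetical sketch (not the authors' code): use homophily of pseudo-labels
# as a label-free signal to evaluate a weighting of multiple pretext-task losses.

import numpy as np
from sklearn.cluster import KMeans


def pseudo_homophily(edges, pseudo_labels):
    """Fraction of edges whose endpoints share the same pseudo-label."""
    src, dst = edges[:, 0], edges[:, 1]
    return float(np.mean(pseudo_labels[src] == pseudo_labels[dst]))


def evaluate_task_weights(train_encoder, edges, n_clusters, weights):
    """Score one candidate weighting of the pretext-task losses.

    `train_encoder` is assumed to train an encoder with the given loss weights
    and return node embeddings of shape (num_nodes, dim); it is a placeholder.
    """
    embeddings = train_encoder(weights)                        # hypothetical helper
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(embeddings)
    return pseudo_homophily(edges, labels)                     # higher is better
```

A search procedure could then propose different `weights` vectors (e.g., via evolutionary or gradient-based search) and keep the combination with the highest pseudo-homophily score, without ever touching ground-truth labels.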