Deep learning on graphs has recently achieved remarkable success on a variety of tasks, but such success relies heavily on massive, carefully labeled data. However, precise annotations are generally very expensive and time-consuming. To address this problem, self-supervised learning (SSL) is emerging as a new paradigm for extracting informative knowledge through well-designed pretext tasks without relying on manual labels. In this survey, we extend the concept of SSL, which first emerged in the fields of computer vision and natural language processing, to present a timely and comprehensive review of existing SSL techniques for graph data. Specifically, we divide existing graph SSL methods into three categories: contrastive, generative, and predictive. More importantly, unlike other surveys that only provide a high-level description of published research, we present an additional mathematical summary of existing works in a unified framework. Furthermore, to facilitate methodological development and empirical comparison, we also summarize the commonly used datasets, evaluation metrics, downstream tasks, open-source implementations, and experimental studies of various algorithms. Finally, we discuss the technical challenges and potential future directions for improving graph self-supervised learning. The latest advances in graph SSL are summarized in a GitHub repository: https://github.com/LirongWu/awesome-graph-self-supervised-learning.