Self-supervised learning provides a promising path towards eliminating the need for costly label information in representation learning on graphs. However, to achieve state-of-the-art performance, methods often need large numbers of negative examples and rely on complex augmentations. This can be prohibitively expensive, especially for large graphs. To address these challenges, we introduce Bootstrapped Graph Latents (BGRL) - a graph representation learning method that learns by predicting alternative augmentations of the input. BGRL uses only simple augmentations and alleviates the need for contrasting with negative examples, and is thus scalable by design. BGRL outperforms or matches prior methods on several established benchmarks, while achieving a 2-10x reduction in memory costs. Furthermore, we show that BGRL can be scaled up to extremely large graphs with hundreds of millions of nodes in the semi-supervised regime - achieving state-of-the-art performance and improving over supervised baselines where representations are shaped only through label information. In particular, our solution centered on BGRL constituted one of the winning entries to the Open Graph Benchmark - Large Scale Challenge at KDD Cup 2021, on a graph orders of magnitude larger than all previously available benchmarks, thus demonstrating the scalability and effectiveness of our approach.
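To make the bootstrapping scheme concrete, the following is a minimal sketch of a BGRL-style training step in PyTorch. It assumes a GNN `encoder` taking `(x, edge_index)` inputs and a small MLP `predictor`; these names, and the two pre-augmented views passed in, are illustrative assumptions rather than the paper's exact API. An online encoder plus predictor is trained to match a slowly-updated target encoder's embedding of the other view, so no negative examples are contrasted.

```python
import copy
import torch
import torch.nn.functional as F

class BGRLSketch(torch.nn.Module):
    """Hypothetical minimal BGRL-style module (not the official implementation)."""

    def __init__(self, encoder, predictor, ema_decay=0.99):
        super().__init__()
        self.online_encoder = encoder
        self.predictor = predictor
        # The target encoder is an EMA copy of the online encoder;
        # it receives no gradients.
        self.target_encoder = copy.deepcopy(encoder)
        for p in self.target_encoder.parameters():
            p.requires_grad = False
        self.ema_decay = ema_decay

    def loss(self, view1, view2):
        # Each view is an augmented (x, edge_index) pair, e.g. produced by
        # simple node-feature masking and edge dropping (assumed upstream).
        q1 = self.predictor(self.online_encoder(*view1))
        q2 = self.predictor(self.online_encoder(*view2))
        with torch.no_grad():
            y1 = self.target_encoder(*view1)
            y2 = self.target_encoder(*view2)
        # Symmetrized negative cosine similarity between the online
        # prediction of one view and the target embedding of the other;
        # no negative examples are needed.
        return -(F.cosine_similarity(q1, y2, dim=-1).mean()
                 + F.cosine_similarity(q2, y1, dim=-1).mean())

    @torch.no_grad()
    def update_target(self):
        # Exponential moving average of online parameters into the target.
        for po, pt in zip(self.online_encoder.parameters(),
                          self.target_encoder.parameters()):
            pt.mul_(self.ema_decay).add_(po, alpha=1 - self.ema_decay)
```

Because the loss involves only the two views of each node rather than all pairs of nodes, memory grows linearly with graph size, which is the property that makes the method scalable by design.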