Real-world data is mostly unlabeled, or only a few instances are labeled. Manually labeling data is an expensive and daunting task. This calls for unsupervised learning techniques that are powerful enough to achieve results comparable to those of semi-supervised/supervised techniques. Contrastive self-supervised learning has emerged as a powerful direction, in some cases outperforming supervised techniques. In this study, we propose SelfGNN, a novel contrastive self-supervised graph neural network (GNN) that does not rely on explicit contrastive terms. Instead, we leverage Batch Normalization, which introduces implicit contrastive terms, without sacrificing performance. Furthermore, as data augmentation is key in contrastive learning, we introduce four feature augmentation (FA) techniques for graphs. Although graph topological augmentation (TA) is commonly used, our empirical findings show that FA performs as well as TA. Moreover, FA incurs no computational overhead, unlike TA, which often has O(N^3) time complexity, where N is the number of nodes. Our empirical evaluation on seven publicly available real-world datasets shows that SelfGNN is powerful, achieving performance comparable to SOTA supervised GNNs and consistently better than SOTA semi-supervised and unsupervised GNNs. The source code is available at https://github.com/zekarias-tilahun/SelfGNN.
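To make the core idea concrete, below is a minimal PyTorch sketch of a BYOL-style siamese setup in which a BatchNorm layer in the online predictor supplies the implicit contrastive signal, so no explicit negative pairs are needed. This is an illustrative assumption-laden sketch, not the authors' implementation: the names `Encoder`, `Predictor`, `mask_features`, and `selfgnn_step` are ours, a plain linear encoder stands in for the actual GNN layers, and random feature masking is one plausible FA, not necessarily one of the paper's four techniques.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    # Hypothetical stand-in; the real SelfGNN encoder applies GNN layers
    # over the graph structure rather than a plain linear map.
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.PReLU())

    def forward(self, x):
        return self.net(x)

class Predictor(nn.Module):
    # BatchNorm couples every node in the batch through shared statistics,
    # acting as an implicit contrastive term (no explicit negatives).
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim), nn.BatchNorm1d(dim), nn.PReLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, x):
        return self.net(x)

def mask_features(x, p=0.2):
    # Illustrative feature augmentation (FA): randomly zero feature columns.
    # Unlike topological augmentation, this costs no extra graph computation.
    mask = (torch.rand(x.size(1)) > p).float()
    return x * mask

def selfgnn_step(online, target, predictor, x, tau=0.99):
    # Online branch predicts the target branch's representation of a
    # different view; the target encoder receives no gradients.
    p = predictor(online(mask_features(x)))
    with torch.no_grad():
        z = target(mask_features(x))
    loss = 2 - 2 * F.cosine_similarity(p, z, dim=-1).mean()
    loss.backward()
    # Exponential-moving-average update of the target network.
    for po, pt in zip(online.parameters(), target.parameters()):
        pt.data = tau * pt.data + (1 - tau) * po.data
    return loss.item()

# Usage with random data in place of real node features:
enc = Encoder(128, 64)
tgt = copy.deepcopy(enc)
for prm in tgt.parameters():
    prm.requires_grad_(False)
pred = Predictor(64)
opt = torch.optim.Adam(list(enc.parameters()) + list(pred.parameters()), lr=1e-3)
x = torch.randn(256, 128)
opt.zero_grad()
print(selfgnn_step(enc, tgt, pred, x))
opt.step()
```

The design choice worth noting is that only the online branch carries the BatchNorm-equipped predictor; the target branch is a gradient-free EMA copy, which is what lets the batch statistics, rather than explicit negative samples, prevent representational collapse.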