SimGRACE: A Simple Framework for Graph Contrastive Learning without Data Augmentation

Graph contrastive learning (GCL) has emerged as a dominant technique for graph representation learning; it maximizes the mutual information between paired graph augmentations that share the same semantics. Unfortunately, the diverse nature of graph data makes it difficult to preserve semantics during augmentation. Currently, data augmentations in GCL that are designed to preserve semantics broadly fall into three unsatisfactory categories. First, the augmentations can be picked manually per dataset by trial and error. Second, the augmentations can be selected via cumbersome search. Third, the augmentations can be obtained by introducing expensive domain-specific knowledge as guidance. All of these limit the efficiency and general applicability of existing GCL methods. To circumvent these issues, we propose a Simple framework for GRAph Contrastive lEarning, SimGRACE for brevity, which does not require data augmentations. Specifically, we take the original graph as input and use a GNN model together with its perturbed version as two encoders to obtain two correlated views for contrast. SimGRACE is inspired by the observation that graph data preserve their semantics well under encoder perturbations, which require no manual trial and error, cumbersome search, or expensive domain knowledge for augmentation selection. We also explain why SimGRACE can succeed. Furthermore, we devise an adversarial training scheme, dubbed AT-SimGRACE, to enhance the robustness of graph contrastive learning, and we explain the reasons theoretically. Albeit simple, we show that SimGRACE can yield competitive or better performance compared with state-of-the-art methods in terms of generalizability, transferability, and robustness, while enjoying an unprecedented degree of flexibility and efficiency.
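The core idea of contrasting an encoder with a perturbed copy of itself can be illustrated with a minimal sketch. The abstract does not specify the form of the perturbation, the encoder architecture, or the loss, so everything below is an assumption made for illustration: a toy one-layer mean-aggregation encoder, Gaussian noise scaled by the standard deviation of the weights (with a hypothetical magnitude `eta`), and an InfoNCE-style loss with a hypothetical temperature `tau`. It is not the authors' implementation.

```python
import math
import random


def encode(adj, feats, weight):
    """Toy one-layer GNN: mean over neighbors (with self-loop), linear map, tanh, sum readout."""
    n, d_in, d_out = len(adj), len(feats[0]), len(weight[0])
    graph_vec = [0.0] * d_out
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j] or j == i]
        agg = [sum(feats[j][k] for j in nbrs) / len(nbrs) for k in range(d_in)]
        for c in range(d_out):
            graph_vec[c] += math.tanh(sum(agg[k] * weight[k][c] for k in range(d_in)))
    return graph_vec


def perturb(weight, eta, rng):
    """Return a noisy copy of the weights; the original is left untouched.

    Assumption: Gaussian noise with std equal to eta times the std of the weights.
    """
    flat = [w for row in weight for w in row]
    mean = sum(flat) / len(flat)
    std = math.sqrt(sum((w - mean) ** 2 for w in flat) / len(flat))
    return [[w + eta * rng.gauss(0.0, std) for w in row] for row in weight]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + 1e-12)


def contrastive_loss(z1, z2, tau=0.5):
    """InfoNCE-style loss (assumed): graph i in z1 should match graph i in z2 within the batch."""
    total = 0.0
    for i, zi in enumerate(z1):
        logits = [cosine(zi, zj) / tau for zj in z2]
        log_denom = math.log(sum(math.exp(l) for l in logits))
        total += log_denom - logits[i]
    return total / len(z1)


rng = random.Random(0)
# Two tiny graphs (adjacency matrix, node features): a triangle and a 3-node path.
graphs = [
    ([[0, 1, 1], [1, 0, 1], [1, 1, 0]], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]),
    ([[0, 1, 0], [1, 0, 1], [0, 1, 0]], [[0.5, -1.0], [1.0, 0.5], [-1.0, 0.0]]),
]
weight = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
noisy_weight = perturb(weight, eta=0.1, rng=rng)

# The same unaugmented graphs pass through both encoders to give two views.
z_clean = [encode(a, x, weight) for a, x in graphs]
z_noisy = [encode(a, x, noisy_weight) for a, x in graphs]
loss = contrastive_loss(z_clean, z_noisy)
print(f"contrastive loss: {loss:.4f}")
```

Note that the input graphs are never modified: the only source of difference between the two views is the noise added to the encoder weights. In a real training loop the loss would be backpropagated through the unperturbed encoder, which this dependency-free sketch omits.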
