Data-efficient learning on graphs (GEL) is essential in real-world applications. Existing GEL methods focus on learning useful representations for nodes, edges, or entire graphs with ``small'' labeled data, but the problem of data-efficient learning for subgraph prediction has not been explored. The challenges of this problem lie in the following aspects: 1) It is crucial for subgraphs to learn positional features in order to acquire structural information from the base graph in which they reside. Although existing subgraph neural networks can learn disentangled position encodings, their overall computational complexity is very high. 2) Prevailing graph augmentation methods for GEL, including rule-based, sample-based, adaptive, and automated methods, are not suitable for augmenting subgraphs, because a subgraph contains fewer nodes but richer information such as position, neighborhood, and structure; subgraph augmentation is therefore more susceptible to undesirable perturbations. 3) Only a small fraction of the nodes in the base graph are contained in subgraphs, which leads to a potential ``bias'' problem: subgraph representation learning is dominated by these ``hot'' nodes, while the remaining nodes fail to be fully learned, reducing the generalization ability of subgraph representation learning. In this paper, we aim to address the challenges above and propose a Position-Aware Data-Efficient Learning framework for subgraph neural networks called PADEL. Specifically, we propose a novel anchor-free node position encoding method, design a new generative subgraph augmentation method based on a diffused variational subgraph autoencoder, and propose exploratory and exploitable views for subgraph contrastive learning. Extensive experimental results on three real-world datasets show the superiority of our proposed method over state-of-the-art baselines.