When to Pre-Train Graph Neural Networks? An Answer from Data Generation Perspective!

Graph pre-training has recently attracted wide research attention. It aims to learn transferable knowledge from unlabeled graph data in order to improve downstream performance. Despite these efforts, negative transfer remains a major issue when applying pre-trained graph models to downstream tasks. Existing work has focused on what to pre-train and how to pre-train, designing a number of graph pre-training and fine-tuning strategies. However, there are cases where, no matter how advanced the strategy is, the "pre-train and fine-tune" paradigm still cannot achieve clear benefits. This paper introduces a generic framework, W2PGNN, to answer the crucial question of when to pre-train (i.e., in what situations graph pre-training can be beneficial) before performing effortful pre-training or fine-tuning. We start from a new perspective and explore the complex generative mechanisms from the pre-training data to the downstream data. In particular, W2PGNN first fits the pre-training data into graphon bases; each element of a graphon basis (i.e., a graphon) identifies a fundamental transferable pattern shared by a collection of pre-training graphs. All convex combinations of the graphon bases give rise to a generator space, and the graphs generated from it form the solution space of downstream data that can benefit from pre-training. In this manner, the feasibility of pre-training can be quantified as the generation probability of the downstream data from any generator in the generator space. W2PGNN has three broad applications: providing the application scope of graph pre-trained models, quantifying the feasibility of pre-training, and helping select pre-training data to enhance downstream performance. We give a theoretically sound solution for the first application and extensive empirical justification for the latter two.
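To make the generator-space idea more concrete, here is a minimal Python sketch. It is not the W2PGNN implementation, and every function name in it is hypothetical. It assumes a graphon can be approximated by a k x k step function estimated by sorting nodes by degree and averaging edge density per block, that a generator is a convex combination of two such bases, and that a plain L2 distance between step functions can serve as a crude stand-in for how well the generator space explains a downstream graph. The abstract does not specify the estimator or the feasibility measure, so all three choices are illustrative assumptions.

```python
import random


def step_graphon(adj, k):
    """Estimate a k x k step-function graphon from an adjacency matrix.

    Assumption: nodes are aligned by sorting on degree, then split into
    k contiguous blocks; each entry is the edge density between two blocks.
    Requires len(adj) >= k so that no block is empty.
    """
    n = len(adj)
    order = sorted(range(n), key=lambda i: sum(adj[i]), reverse=True)
    blocks = [order[b * n // k:(b + 1) * n // k] for b in range(k)]
    W = [[0.0] * k for _ in range(k)]
    for a in range(k):
        for b in range(k):
            # Skip self-pairs so the diagonal is not biased by missing self-loops.
            pairs = [(i, j) for i in blocks[a] for j in blocks[b] if i != j]
            if pairs:
                W[a][b] = sum(adj[i][j] for i, j in pairs) / len(pairs)
    return W


def convex_combination(bases, weights):
    """Mix step-function graphons with non-negative weights that sum to 1."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    k = len(bases[0])
    return [[sum(w * B[a][b] for w, B in zip(weights, bases)) for b in range(k)]
            for a in range(k)]


def sample_graph(W, n, rng):
    """Sample an undirected n-node graph from a step-function graphon W."""
    k = len(W)
    # Each node draws a latent position in [0, 1), mapped to a block index.
    block = [min(int(rng.random() * k), k - 1) for _ in range(n)]
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < W[block[i]][block[j]]:
                adj[i][j] = adj[j][i] = 1
    return adj


def l2_distance(W1, W2):
    """Root-mean-square difference between two k x k step functions."""
    k = len(W1)
    sq = sum((W1[a][b] - W2[a][b]) ** 2 for a in range(k) for b in range(k))
    return (sq / (k * k)) ** 0.5


def best_mixture(base_a, base_b, target, steps=10):
    """Grid-search the mixing weight of two bases that best matches target.

    Returns (weight on base_a, distance). A small distance is read here as
    a rough sign that the downstream graph lies near the generator space.
    """
    best = None
    for t in range(steps + 1):
        w = t / steps
        mix = convex_combination([base_a, base_b], [w, 1.0 - w])
        d = l2_distance(mix, target)
        if best is None or d < best[1]:
            best = (w, d)
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    dense = [[0.9, 0.5], [0.5, 0.3]]        # made-up "pre-training" pattern A
    sparse = [[0.2, 0.05], [0.05, 0.02]]    # made-up "pre-training" pattern B
    base_a = step_graphon(sample_graph(dense, 80, rng), k=2)
    base_b = step_graphon(sample_graph(sparse, 80, rng), k=2)
    target = convex_combination([dense, sparse], [0.7, 0.3])
    downstream = step_graphon(sample_graph(target, 80, rng), k=2)
    print(best_mixture(base_a, base_b, downstream))
```

The grid search over a single weight only covers two bases; with more bases the search would run over the full probability simplex, and a distance that is invariant to node relabeling would be a more faithful choice than the degree-sorted L2 proxy used here.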
