When to Pre-Train Graph Neural Networks? An Answer from Data Generation Perspective!

Note from arXiv: this paper was withdrawn because it was submitted without the consent of one of the co-authors. It does not contain any errors that need to be corrected.

Graph pre-training has recently attracted wide research attention. It aims to learn transferable knowledge from unlabeled graph data in order to improve downstream performance. Despite these recent attempts, negative transfer remains a major issue when applying graph pre-trained models to downstream tasks. Existing work has focused on what to pre-train and how to pre-train, designing a number of graph pre-training and fine-tuning strategies. However, there are cases where, no matter how advanced the strategy is, the "pre-train and fine-tune" paradigm still cannot achieve clear benefits. This paper introduces a generic framework, W2PGNN, to answer the crucial question of when to pre-train (i.e., in what situations graph pre-training can be beneficial) before performing effortful pre-training or fine-tuning. The authors start from a new perspective and explore the complex generative mechanisms that lead from the pre-training data to the downstream data. In particular, W2PGNN first fits the pre-training data into graphon bases. Each element of a graphon basis (i.e., a graphon) identifies a fundamental transferable pattern shared by a collection of pre-training graphs. All convex combinations of the graphon bases give rise to a generator space, and the graphs generated from this space form the solution space for downstream data that can benefit from pre-training. In this manner, the feasibility of pre-training can be quantified as the generation probability of the downstream data from any generator in the generator space. W2PGNN supports three broad applications: determining the application scope of graph pre-trained models, quantifying the feasibility of performing pre-training, and helping select pre-training data to enhance downstream performance. The paper gives a theoretically sound solution for the first application and extensive empirical justification for the latter two.
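
The pipeline described in the abstract (fit graphon bases, form convex combinations, score how well a generator explains the downstream data) can be illustrated with a small NumPy sketch. This is not the authors' implementation. Everything here is an assumption made for illustration: the degree-sorting step-function estimator, the helper names (`estimate_graphon`, `combine`, `simplex_grid`, `feasibility`), the grid search over combination weights, and the `exp(-distance)` score, which only stands in as a rough proxy for the generation probability mentioned in the abstract.

```python
import itertools
import numpy as np


def sample_er(n, p, rng):
    """Sample a symmetric Erdos-Renyi adjacency matrix (toy data only)."""
    upper = np.triu((rng.random((n, n)) < p).astype(int), 1)
    return (upper + upper.T).astype(float)


def estimate_graphon(adjs, k=4):
    """Crude k x k step-function graphon estimate (assumed estimator).

    Nodes are sorted by degree, then the adjacency matrix is averaged
    over k x k blocks. Each graph needs at least k nodes.
    """
    est = np.zeros((k, k))
    for a in adjs:
        a = np.asarray(a, dtype=float)
        order = np.argsort(-a.sum(axis=1), kind="stable")
        a = a[np.ix_(order, order)]
        edges = np.linspace(0, a.shape[0], k + 1).astype(int)
        for i in range(k):
            for j in range(k):
                est[i, j] += a[edges[i]:edges[i + 1], edges[j]:edges[j + 1]].mean()
    return est / len(adjs)


def combine(bases, weights):
    """Convex combination of graphon bases: one point in the generator space."""
    w = np.asarray(weights, dtype=float)
    assert np.all(w >= 0) and np.isclose(w.sum(), 1.0)
    return np.tensordot(w, np.stack(bases), axes=1)


def simplex_grid(m, steps):
    """Enumerate weight vectors on a regular grid over the m-simplex."""
    for c in itertools.product(range(steps + 1), repeat=m - 1):
        if sum(c) <= steps:
            yield np.array(list(c) + [steps - sum(c)]) / steps


def feasibility(bases, downstream_adjs, k=4, steps=10):
    """Hypothetical feasibility proxy: exp(-d), where d is the smallest
    RMS distance between the downstream estimate and any grid generator."""
    target = estimate_graphon(downstream_adjs, k)
    best = min(
        np.linalg.norm(combine(bases, w) - target) / k
        for w in simplex_grid(len(bases), steps)
    )
    return float(np.exp(-best))


rng = np.random.default_rng(0)
# Two toy "pre-training domains": sparser and denser random graphs.
bases = [
    estimate_graphon([sample_er(40, 0.2, rng) for _ in range(5)]),
    estimate_graphon([sample_er(40, 0.5, rng) for _ in range(5)]),
]
in_score = feasibility(bases, [sample_er(40, 0.35, rng) for _ in range(5)])
out_score = feasibility(bases, [sample_er(40, 0.95, rng) for _ in range(5)])
print(f"inside generator space: {in_score:.3f}, outside: {out_score:.3f}")
```

In this toy setup, downstream graphs whose density lies between the two bases should score higher than graphs far denser than either basis, which mirrors the idea that pre-training helps only when some generator in the space can plausibly produce the downstream data.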
