Given an input graph and its label/property, several key problems of graph learning, such as finding interpretable subgraphs, graph denoising, and graph compression, can be reduced to the fundamental problem of recognizing a subgraph of the original graph. This subgraph should be as informative as possible, yet contain as little redundant and noisy structure as possible. This problem setting is closely related to the well-known information bottleneck (IB) principle, which, however, has been less studied for irregular graph data and graph neural networks (GNNs). In this paper, we propose a framework of Graph Information Bottleneck (GIB) for the subgraph recognition problem in deep graph learning. Under this framework, one can recognize the maximally informative yet compressive subgraph, named the IB-subgraph. However, the GIB objective is notoriously hard to optimize, mostly due to the intractability of mutual information on irregular graph data and the unstable optimization process. To tackle these challenges, we propose: i) a GIB objective based on a mutual information estimator for irregular graph data; ii) a bi-level optimization scheme to maximize the GIB objective; iii) a connectivity loss to stabilize the optimization process. We evaluate the properties of the IB-subgraph in three application scenarios: improvement of graph classification, graph interpretation, and graph denoising. Extensive experiments demonstrate that the information-theoretic IB-subgraph enjoys superior graph properties.