The emergence of Graph Convolutional Networks (GCNs) has greatly boosted the progress of graph learning. However, two disturbing factors impede the further development of GCNs: noise and redundancy in graph data, and the lack of interpretation for prediction results. One solution is to recognize a predictive yet compressed subgraph that removes the noise and redundancy and exposes the interpretable part of the graph. This notion of subgraph resembles the information bottleneck (IB) principle, which has been little studied on graph-structured data and GCNs. Inspired by the IB principle, we propose a novel subgraph information bottleneck (SIB) framework to recognize such subgraphs, named IB-subgraph. However, the intractability of mutual information and the discrete nature of graph data make the SIB objective notoriously hard to optimize. To this end, we introduce a bilevel optimization scheme coupled with a mutual information estimator for irregular graphs. Moreover, we propose a continuous relaxation for subgraph selection, together with a connectivity loss for stabilization. We further theoretically prove the error bound of our estimation scheme for mutual information and the noise-invariant nature of IB-subgraph. Extensive experiments on graph learning and large-scale point cloud tasks demonstrate the superior properties of IB-subgraph.