The goal of graph summarization is to represent large graphs in a structured and compact way. A graph summary based on equivalence classes preserves pre-defined features of a graph's vertices within their $k$-hop neighborhood, such as the vertex labels and edge labels. Based on these neighborhood characteristics, each vertex is assigned to an equivalence class. The computation of the assigned equivalence class must be a permutation-invariant operation on the pre-defined features. This is achieved by sorting the feature values, e.g., the edge labels, which is computationally expensive, and subsequently hashing the result. Graph Neural Networks (GNNs) fulfill the permutation-invariance requirement. We formulate the problem of graph summarization as a subgraph classification task on the root vertex of the $k$-hop neighborhood. We adapt different GNN architectures, both based on the popular message-passing protocol and on alternative approaches, to perform the structural graph summarization task. We compare the GNNs with a standard multi-layer perceptron (MLP) and a Bloom filter as a non-neural baseline. For our experiments, we consider four popular graph summary models on a large web graph. These yield challenging multi-class vertex classification tasks, with the number of classes ranging from $576$ to multiple hundreds of thousands. Our results show that the performances of the GNNs are close to each other. In three out of four experiments, the non-message-passing GraphMLP model outperforms the other GNNs. The performance of the standard MLP is extraordinarily good, especially in the presence of many classes. Finally, the Bloom filter outperforms all neural architectures by a large margin, except on the dataset with the fewest classes ($576$).
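The sort-then-hash assignment of equivalence classes described above can be illustrated with a minimal sketch. The function name and feature encoding below are hypothetical, not taken from the paper; the point is only that sorting the multisets of vertex and edge labels before hashing makes the class identifier independent of the order in which the neighborhood is traversed.

```python
import hashlib

def equivalence_class(vertex_labels, edge_labels):
    """Permutation-invariant class identifier for a vertex's 1-hop features.

    Hypothetical sketch: sorting the label multisets removes any dependence
    on traversal order; hashing the canonical string yields the class ID.
    """
    canonical = "|".join(sorted(vertex_labels)) + "#" + "|".join(sorted(edge_labels))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# The same neighborhood presented in any order maps to the same class:
a = equivalence_class(["Person"], ["knows", "likes"])
b = equivalence_class(["Person"], ["likes", "knows"])
assert a == b
```

The sorting step is what makes this computationally expensive on high-degree vertices, which motivates replacing it with an inherently permutation-invariant neural aggregation.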