We present SsAG, an efficient and scalable lossy graph summarization method that retains the essential structure of the original graph. SsAG computes a sparse representation (summary) of the input graph and also handles graphs with node attributes. The summary of a graph $G$ is stored as a graph on supernodes (subsets of vertices of $G$), in which weighted superedges connect pairs of supernodes. The proposed method constructs a summary graph on $k$ supernodes that minimizes the reconstruction error (the difference between the original graph and the graph reconstructed from the summary) and maximizes homogeneity with respect to attributes. We construct the summary by iteratively merging pairs of supernodes. We derive a closed-form expression to efficiently compute the reconstruction error after merging a pair, and we approximate this score in constant time. To reduce the search space for selecting the best pair to merge, we assign each supernode a weight that closely quantifies its contribution to the score of the pairs containing it. We choose the best pair for merging from a random sample of supernodes selected with probability proportional to their weights. With weighted sampling, a logarithmic-sized sample yields a summary of comparable quality under various quality measures. We also propose a sparsification step that reduces the storage cost of the constructed summary to a given target size with a marginal increase in reconstruction error. Empirical evaluation on several real-world graphs and comparison with state-of-the-art methods show that SsAG is up to $5\times$ faster and generates summaries of comparable quality.
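The merge loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`reconstruction_error`, `summarize`), the weight `1 / (1 + total degree)`, and the sample size `ceil(log2(m)) + 1` are assumptions chosen for illustration, and the score of each candidate pair is recomputed by brute force rather than with the closed-form, constant-time approximation the abstract refers to. The error used here is the squared difference between each adjacency entry and the edge density of its supernode block, which is one common way to define reconstruction error.

```python
import itertools
import math
import random


def reconstruction_error(edges, supernodes):
    """Squared error between the adjacency matrix and the block densities
    implied by the summary, summed over unordered vertex pairs."""
    block = {v: i for i, members in enumerate(supernodes) for v in members}
    counts = {}
    for u, v in edges:
        key = tuple(sorted((block[u], block[v])))
        counts[key] = counts.get(key, 0) + 1
    err = 0.0
    for i in range(len(supernodes)):
        for j in range(i, len(supernodes)):
            ni, nj = len(supernodes[i]), len(supernodes[j])
            pairs = ni * (ni - 1) // 2 if i == j else ni * nj
            if pairs == 0:
                continue
            e = counts.get((i, j), 0)
            # e pairs are edges (value 1), the rest are non-edges (value 0);
            # with density p = e / pairs the squared error is e - e^2 / pairs.
            err += e - e * e / pairs
    return err


def summarize(edges, n, k, seed=0):
    """Greedily merge supernodes until k remain, choosing each merge from a
    small weighted random sample of supernodes."""
    rng = random.Random(seed)
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    supernodes = [{v} for v in range(n)]
    while len(supernodes) > k:
        m = len(supernodes)
        # Hypothetical weight (an assumption): favor low-degree supernodes.
        weights = [1.0 / (1 + sum(degree[v] for v in s)) for s in supernodes]
        size = max(2, math.ceil(math.log2(m)) + 1)
        sample = set(rng.choices(range(m), weights=weights, k=size))
        if len(sample) < 2:  # sampling is with replacement, so guard this case
            sample = set(rng.sample(range(m), 2))
        best_pair, best_err = None, math.inf
        for i, j in itertools.combinations(sorted(sample), 2):
            trial = [s for t, s in enumerate(supernodes) if t not in (i, j)]
            trial.append(supernodes[i] | supernodes[j])
            err = reconstruction_error(edges, trial)
            if err < best_err:
                best_pair, best_err = (i, j), err
        i, j = best_pair
        supernodes[i] = supernodes[i] | supernodes[j]
        del supernodes[j]  # j > i, so index i is unaffected
    return supernodes


# Two triangles joined by a single edge, summarized into two supernodes.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
summary = summarize(edges, n=6, k=2)
print(summary, reconstruction_error(edges, summary))
```

Recomputing the full error for every candidate pair keeps the sketch short but costs far more than the constant-time score the method relies on; the sampling step is the part that mirrors the described search-space reduction.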