We introduce a new summarization method for large graphs that retains only a user-specified proportion of each node's neighbors. Our main aim is to simplify large graphs so that they can be analyzed and processed efficiently while preserving as many node neighborhood properties as possible. Since many graph algorithms rely on the neighborhood information available at each node, the idea is to produce a smaller graph on which these algorithms run faster while still providing good approximations of their results on the original graph. Moreover, users can control the size of the compressed graph by adjusting the amount of information loss they are willing to tolerate. Experiments on various real and synthetic graphs show that our compression considerably reduces graph size. We also ran several graph algorithms and applications on the resulting summaries, including node embedding, graph classification, and shortest-path approximation. The results show interesting trade-offs between runtime speed-up and loss of precision.
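To make the core idea concrete, the following is a minimal sketch of neighborhood-proportional summarization, not the method evaluated in the paper. The abstract does not say how the retained neighbors are selected, so uniform random sampling is used here purely as an assumption. The names `summarize_graph`, `edge_count`, and `keep_ratio` are hypothetical.

```python
import math
import random


def summarize_graph(adj, keep_ratio, seed=0):
    """Keep roughly `keep_ratio` of each node's neighbors.

    `adj` maps each node to the set of its neighbors.
    ASSUMPTION: neighbors are chosen uniformly at random; the actual
    selection rule of the summarization method is not given in the text.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    rng = random.Random(seed)
    summary = {}
    for node, neighbors in adj.items():
        ordered = sorted(neighbors)  # fixed order so the seed is reproducible
        # Round up so that every non-isolated node keeps at least one neighbor.
        k = math.ceil(keep_ratio * len(ordered))
        summary[node] = set(rng.sample(ordered, k))
    return summary


def edge_count(adj):
    """Number of stored (node, neighbor) adjacency entries."""
    return sum(len(neighbors) for neighbors in adj.values())


# Small star-like graph: node 0 is connected to every other node.
graph = {0: {1, 2, 3, 4}, 1: {0, 2}, 2: {0, 1}, 3: {0}, 4: {0}}
summary = summarize_graph(graph, keep_ratio=0.5)
print(edge_count(graph), edge_count(summary))
```

Because each node samples its own neighbors independently, the retained adjacency lists in this sketch are not guaranteed to be symmetric. A lower `keep_ratio` yields a smaller summary at the cost of more neighborhood information being lost, which is the size versus precision trade-off the abstract describes.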