Real-world graphs can be difficult to interpret and visualize beyond a certain size. To address this issue, graph summarization aims to simplify and shrink a graph while maintaining its high-level structure and characteristics. Most summarization methods are designed for homogeneous, undirected, simple graphs; however, many real-world graphs are ornate, with characteristics including node labels, directed edges, edge multiplicities, and self-loops. In this paper, we propose LM-Gsum, a versatile yet rigorous graph summarization model that (to the best of our knowledge, for the first time) can handle graphs with all the aforementioned characteristics and any combination thereof. Moreover, our proposed model captures basic sub-structures that are prevalent in real-world graphs, such as cliques and stars. LM-Gsum compactly quantifies the information content of a complex graph using a novel encoding scheme, in which it seeks to minimize the total number of bits required to encode (i) the summary graph and (ii) the corrections required to reconstruct the input graph losslessly. To accelerate summary construction, it creates super-nodes efficiently by merging nodes in groups. Experiments demonstrate that LM-Gsum facilitates the visualization of real-world complex graphs, revealing interpretable structures and high-level relationships. Furthermore, LM-Gsum achieves a better trade-off between compression rate and running time than existing methods, which can be compared only on the settings they also support.
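To make the two-part objective concrete, the following is a minimal sketch of a summary-plus-corrections cost for the simplest possible case: an undirected, simple, unlabeled graph. It is not LM-Gsum's encoding. It counts encoded items (super-edges and corrections) rather than bits, and the function name `two_part_cost`, the greedy per-pair choice, and the toy graph are all assumptions made for illustration.

```python
from itertools import combinations


def two_part_cost(edges, groups):
    """Illustrative cost: (number of super-edges) + (number of corrections).

    edges  -- iterable of 2-tuples, undirected simple edges without self-loops
    groups -- list of disjoint node sets (the super-nodes) covering all nodes

    For every pair of super-nodes (and every super-node with itself) we pick
    the cheaper of two options:
      * place a super-edge (1 item) and list each missing edge as a correction
      * place no super-edge and list each existing edge as a correction
    """
    edge_set = {frozenset(e) for e in edges}
    total = 0
    for i, gi in enumerate(groups):
        for gj in groups[i:]:
            if gi is gj:
                # node pairs inside one super-node
                pairs = [frozenset(p) for p in combinations(gi, 2)]
            else:
                # node pairs across two different super-nodes
                pairs = [frozenset((u, v)) for u in gi for v in gj]
            actual = sum(1 for p in pairs if p in edge_set)
            missing = len(pairs) - actual
            total += min(1 + missing, actual)
    return total


# Toy graph (hypothetical): a 4-clique a-b-c-d, plus node e linked to a, b, c.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"),
         ("c", "d"), ("e", "a"), ("e", "b"), ("e", "c")]

no_merge = [{n} for n in "abcde"]            # every node is its own super-node
merged = [{"a", "b", "c", "d"}, {"e"}]       # the clique becomes one super-node

cost_no_merge = two_part_cost(edges, no_merge)
cost_merged = two_part_cost(edges, merged)
print(cost_no_merge, cost_merged)
```

Merging the clique into one super-node lowers the cost in this toy setting because one self-loop on the super-node replaces six individual edges, and a single super-edge plus one correction (the absent e-d edge) replaces three edges. A full model of the kind described above would additionally have to account for node labels, edge directions, multiplicities, and self-loops, and would measure cost in bits.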