With the rapid expansion of graphs and networks and the growing magnitude of data from all areas of science, effective treatment and compression schemes of context-dependent data is extremely desirable. A particularly interesting direction is to compress the data while keeping the "structural information" only and ignoring the concrete labelings. Under this direction, Choi and Szpankowski introduced the structures (unlabeled graphs) which allowed them to compute the structural entropy of the Erd\H{o}s--R\'enyi random graph model. Moreover, they also provided an asymptotically optimal compression algorithm that (asymptotically) achieves this entropy limit and runs in expectation in linear time. In this paper, we consider the Stochastic Block Models with an arbitrary number of parts. Indeed, we define a partitioned structural entropy for Stochastic Block Models, which generalizes the structural entropy for unlabeled graphs and encodes the partition information as well. We then compute the partitioned structural entropy of the Stochastic Block Models, and provide a compression scheme that asymptotically achieves this entropy limit.
翻译:随着图表和网络的迅速扩展以及来自科学各个领域的数据的日益庞大,根据背景数据的有效处理和压缩计划是极为可取的。一个特别有趣的方向是压缩数据,同时只保留“结构信息”而忽略混凝土标签。在此方向下,Choi和Szpankowski引入了结构(无标签图),允许他们计算Erd\H{o}s-R\'enyi随机图模型的结构酶。此外,它们还提供了一种(暂时的)达到这种酶限值并在线性时间内运行的无症状最佳压缩算法。在本文中,我们考虑具有任意性部分的斯托切区模型。事实上,我们为斯托切克区模型定义了一种隔绝结构酶(无标签图),对分区信息进行普通化。我们随后对托切式模型的隔置结构酶模型进行了(隔置式)最佳算法,并提供了一种可实现这一限制的压缩方案。