The structure of many complex networks includes edge directionality and weights in addition to topology. Network analysis techniques that can seamlessly account for combinations of these properties are therefore desirable. In this paper, we study two important such techniques: centrality and clustering. We adopt an information-flow based model for clustering, which itself builds upon an information-theoretic measure for computing centrality. Our principal contributions include a generalized model of Markov entropic centrality, with the flexibility to tune the importance of node degrees, edge weights and directions, together with a closed-form asymptotic analysis. This model leads to a novel two-stage graph clustering algorithm. The centrality analysis helps us reason about the suitability of our approach for clustering a given graph and determine "query" nodes around which to explore local community structures, leading to an agglomerative clustering mechanism. The entropic centrality computations are amortized by our clustering algorithm, making it computationally efficient: compared to prior approaches that use Markov entropic centrality for clustering, our experiments demonstrate a speed-up of multiple orders of magnitude. Our clustering algorithm naturally inherits the flexibility to accommodate edge directionality, as well as different interpretations of, and interplay between, edge weights and node degrees. Overall, this paper makes theoretical and conceptual contributions and translates them into artifacts of practical relevance, yielding new, effective and scalable centrality computations and graph clustering algorithms whose efficacy has been validated through extensive benchmarking experiments.
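To make the notion of an entropy-based centrality on a weighted, directed graph concrete, the following is a minimal sketch and not the generalized model of this paper. It assumes a plain random walk whose transition probabilities are proportional to outgoing edge weights, and scores each node by the Shannon entropy of the walker's position after a fixed number of steps. The function names `transition_matrix` and `entropic_centrality`, the self-loop treatment of sink nodes, and the absence of any tuning parameters are assumptions made for illustration only.

```python
import numpy as np


def transition_matrix(weights):
    """Row-normalise a non-negative weighted adjacency matrix.

    weights[i, j] is the weight of the directed edge i -> j.
    Assumption of this sketch: a node with no outgoing edge gets a
    self-loop, so every row is a valid probability distribution.
    """
    w = np.asarray(weights, dtype=float).copy()
    sinks = np.flatnonzero(w.sum(axis=1) == 0)
    w[sinks, sinks] = 1.0
    return w / w.sum(axis=1, keepdims=True)


def entropic_centrality(weights, steps):
    """Entropy (in bits) of the walker's position after `steps` steps,
    computed for every possible start node."""
    p = np.linalg.matrix_power(transition_matrix(weights), steps)
    # Treat 0 * log(0) as 0 by only taking logs of positive entries.
    logs = np.zeros_like(p)
    np.log2(p, out=logs, where=p > 0)
    return -(p * logs).sum(axis=1)


if __name__ == "__main__":
    # Hypothetical toy graph: node 0 points to nodes 1 and 2 with equal
    # weight; nodes 1 and 2 have no outgoing edges.
    toy = np.array([
        [0.0, 1.0, 1.0],
        [0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0],
    ])
    print(entropic_centrality(toy, steps=1))
```

In this toy graph, a walker starting at node 0 is equally likely to be at node 1 or node 2 after one step, so node 0 receives one bit of entropy, while the two sink nodes receive zero. Changing the edge weights or reversing edge directions changes the scores, which is the kind of sensitivity the generalized model is designed to control.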