Hypergraphs are a natural modeling paradigm for a wide range of complex relational systems. A standard analysis task is to identify clusters of closely related or densely interconnected nodes. Many graph algorithms for this task are based on variants of the stochastic blockmodel, a random graph with flexible cluster structure. However, there are few models and algorithms for hypergraph clustering. Here, we propose a Poisson degree-corrected hypergraph stochastic blockmodel (DCHSBM), a generative model of clustered hypergraphs with heterogeneous node degrees and edge sizes. Approximate maximum-likelihood inference in the DCHSBM naturally leads to a clustering objective that generalizes the popular modularity objective for graphs. We derive a general Louvain-type algorithm for this objective, as well as a faster, specialized "All-Or-Nothing" (AON) variant in which edges are expected to lie fully within clusters. This special case encompasses a recent proposal for modularity in hypergraphs, while also incorporating flexible resolution and edge-size parameters. We show that AON hypergraph Louvain is highly scalable, as illustrated by an experiment on a synthetic hypergraph of one million nodes. We also demonstrate through synthetic experiments that the detectability regimes for hypergraph community detection differ from those of methods based on dyadic graph projections. We use our generative model to analyze different patterns of higher-order structure in school contact networks, U.S. congressional bill cosponsorship, U.S. congressional committees, product categories in co-purchasing behavior, and hotel locations from web browsing sessions, finding interpretable higher-order structure. We then study the behavior of our AON hypergraph Louvain algorithm, finding that it is able to recover ground-truth clusters in empirical data sets exhibiting the corresponding higher-order structure.
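To make the "All-Or-Nothing" idea concrete, the following minimal Python sketch counts, for each edge size, how many hyperedges lie entirely within a single cluster, and contrasts this with a weighted dyadic (clique) projection of the same hypergraph. It is an illustration only, not the DCHSBM likelihood or the AON modularity objective itself: the function names `aon_counts` and `clique_projection`, the toy hypergraph, and the cluster labels are all assumptions introduced here.

```python
from collections import Counter
from itertools import combinations


def aon_counts(edges, labels):
    """Per edge size, count hyperedges whose nodes all share one cluster label.

    Hypothetical helper: returns (within, total), both Counters keyed by edge size.
    """
    within = Counter()
    total = Counter()
    for edge in edges:
        k = len(edge)
        total[k] += 1
        # "All-or-nothing": the edge counts only if every node has the same label.
        if len({labels[v] for v in edge}) == 1:
            within[k] += 1
    return within, total


def clique_projection(edges):
    """Weighted dyadic projection: each hyperedge adds 1 to every node pair it contains."""
    weights = Counter()
    for edge in edges:
        for u, v in combinations(sorted(edge), 2):
            weights[(u, v)] += 1
    return weights


# Toy hypergraph (assumed data): five hyperedges over nodes 0..4, two clusters.
edges = [(0, 1, 2), (0, 1), (2, 3, 4), (3, 4), (1, 2, 3)]
labels = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b"}

within, total = aon_counts(edges, labels)
proj = clique_projection(edges)

for k in sorted(total):
    print(f"size {k}: {within[k]} of {total[k]} edges fully within a cluster")
print("projected weight between nodes 2 and 3:", proj[(2, 3)])
```

In this toy example the pair (2, 3) straddles the two clusters yet receives projected weight 2, because two size-3 hyperedges cross the boundary. The all-or-nothing counts treat those same hyperedges simply as not contained in any cluster, which hints at why hypergraph-based and projection-based methods can behave differently.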