Projections of bipartite or two-mode networks capture co-occurrences, and are used in diverse fields (e.g., ecology, economics, bibliometrics, politics) to represent unipartite networks that would otherwise be difficult or impossible to measure directly. A key challenge in analyzing such networks is determining whether an observed number of co-occurrences is statistically significant. Several models now exist for doing so, and thus for extracting the backbone of bipartite projections, but they have not been directly compared to each other. In this paper, we compare five such models -- fixed fill model (FFM), fixed row model (FRM), fixed column model (FCM), fixed degree sequence model (FDSM), and stochastic degree sequence model (SDSM) -- in terms of accuracy, speed, statistical power, similarity, and community detection. We find that the computationally fast SDSM offers a statistically conservative but close approximation of the computationally impractical FDSM under a wide range of conditions, and that it correctly recovers a known community structure even when the signal is weak. Therefore, although each backbone model may have particular applications, we recommend SDSM for extracting the backbone of most bipartite projections.
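To make the idea of testing co-occurrences concrete, the following is a minimal sketch of backbone extraction under a fixed-row null, in which the number of columns shared by two rows follows a hypergeometric distribution. It is an illustration only, not the implementation used in the paper, and it does not cover the recommended SDSM. The function names (`project`, `frm_pvalue`, `frm_backbone`), the toy matrix `B`, the one-tailed test, and the threshold `alpha = 0.05` without any multiple-testing correction are all assumptions made for the example.

```python
from math import comb

def project(B):
    """Row-by-row co-occurrence counts of a binary bipartite matrix B."""
    n = len(B)
    return [[sum(a * b for a, b in zip(B[i], B[j])) for j in range(n)]
            for i in range(n)]

def frm_pvalue(k, d_i, d_j, m):
    """Upper-tail probability P(X >= k) for a hypergeometric X.

    With only the row sums fixed, row i occupies d_i of the m columns and
    row j picks d_j columns at random; X counts the columns they share.
    """
    total = comb(m, d_j)
    upper = min(d_i, d_j)
    return sum(comb(d_i, x) * comb(m - d_i, d_j - x)
               for x in range(k, upper + 1)) / total

def frm_backbone(B, alpha=0.05):
    """Keep edge (i, j) when its co-occurrence count is significantly large.

    Assumption: one-tailed test at level alpha, no multiple-testing correction.
    """
    n, m = len(B), len(B[0])
    degrees = [sum(row) for row in B]
    P = project(B)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            if frm_pvalue(P[i][j], degrees[i], degrees[j], m) < alpha:
                edges.add((i, j))
    return edges

# Hypothetical toy data: 4 agents (rows) by 8 artifacts (columns).
B = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [1, 0, 1, 0, 1, 0, 1, 0],
]
print(frm_backbone(B))
```

In this toy matrix only the first two rows share more columns than the null makes plausible, so only that pair survives in the backbone. Models that also constrain the column sums, such as FDSM and SDSM, need a different null distribution and are not sketched here.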