Researchers have long observed that the "small-world" property, which combines high transitivity (clustering) with a low average path length, is ubiquitous in networks from a variety of disciplines, including the social sciences, biology, neuroscience, and ecology. However, we identify three shortcomings of the currently popular definition and detection methods that render the concept less powerful. First, the classical definition combines high transitivity with a low average path length in a rather ad hoc fashion, which confounds the two separate aspects. We find that in several cases, networks are flagged as "small world" by the current methodology solely because of their high transitivity. Second, the detection methods lack formal statistical inference. Third, the comparison is typically performed against simplistic random graph models as the baseline, which ignore well-known network characteristics. We propose three innovations to address these issues. First, we decouple high transitivity and low average path length and treat them as separate events to test for. Second, we define the property as a statistical test between a suitable null model and a superimposed alternative model. Third, the test is performed using a parametric bootstrap with several null models, allowing a wide range of background structures in the network. In addition to the bootstrap tests, we also propose an asymptotic test under the Erd\H{o}s-R\'{e}nyi null model, for which we provide theoretical guarantees on the asymptotic level and power. Applying the proposed methods to a large number of network datasets, we uncover new insights about their small-world property.
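To make the idea of decoupled parametric bootstrap tests concrete, here is a minimal sketch in plain Python under an Erdos-Renyi null model. It is an illustration under stated assumptions, not the authors' implementation: the function names (`ring_lattice`, `er_graph`, `bootstrap_pvalues`), the one-sided directions of the two comparisons, the add-one p-value correction, and the choice to average path lengths over connected pairs only are all assumptions introduced here. The asymptotic test and the richer null models mentioned in the abstract are not covered.

```python
import random
from collections import deque
from itertools import combinations


def ring_lattice(n, k):
    """Ring of n nodes, each joined to its k nearest neighbours (k even). Toy input only."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj


def er_graph(n, p, rng):
    """Sample an Erdos-Renyi graph G(n, p) as a dict of adjacency sets."""
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def transitivity(adj):
    """Global clustering: closed connected triples divided by all connected triples."""
    closed = 0
    triples = 0
    for nbrs in adj.values():
        k = len(nbrs)
        triples += k * (k - 1) // 2
        closed += sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return closed / triples if triples else 0.0


def avg_path_length(adj):
    """Mean BFS distance over connected ordered pairs (assumption: unreachable pairs are skipped)."""
    total = 0
    pairs = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("inf")


def bootstrap_pvalues(adj, n_boot=200, seed=0):
    """Two separate one-sided bootstrap p-values under a fitted Erdos-Renyi null."""
    rng = random.Random(seed)
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    p_hat = m / (n * (n - 1) / 2)  # fitted edge probability of the null model
    c_obs = transitivity(adj)
    l_obs = avg_path_length(adj)
    c_hits = 0
    l_hits = 0
    for _ in range(n_boot):
        g = er_graph(n, p_hat, rng)
        c_hits += int(transitivity(g) >= c_obs)      # null at least as clustered as observed
        l_hits += int(avg_path_length(g) <= l_obs)   # null at least as short as observed
    return {
        "transitivity": (1 + c_hits) / (1 + n_boot),
        "path_length": (1 + l_hits) / (1 + n_boot),
    }


if __name__ == "__main__":
    print(bootstrap_pvalues(ring_lattice(30, 4)))
```

Reporting the two p-values separately mirrors the decoupling described above: a ring lattice, for example, is far more clustered than a density-matched random graph but has longer paths, so a single combined index could mislead where two separate tests would not.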