Community detection refers to the problem of clustering the nodes of a network into groups. Existing inferential methods for community structure mainly focus on unweighted (binary) networks. Many real-world networks are nonetheless weighted, and a common practice is to dichotomize a weighted network into an unweighted one, which is known to result in information loss. Literature on hypothesis testing in the weighted setting is still missing. In this paper, we study the problem of testing for the existence of community structure in weighted networks. Our contributions are threefold: (a) We use the (possibly infinite-dimensional) exponential family to model the weights and derive the sharp information-theoretic limit for the existence of a consistent test. Within the limit, any test is inconsistent; beyond the limit, we propose a useful consistent test. (b) Based on the information-theoretic limits, we provide the first formal way to quantify the loss of information incurred by dichotomizing weighted graphs into unweighted graphs in the context of hypothesis testing. (c) We propose several new and practically useful test statistics. Simulation studies show that the proposed tests perform well. Finally, we apply the proposed tests to an animal social network.