In modern science and engineering, it is increasingly common to collect samples of multiple networks, in which each network serves as a basic data object. The growing prevalence of such data calls for the development of models and theory that can handle inference problems for populations of networks. In this work, we propose a general procedure for hypothesis testing on networks and, in particular, for differentiating the distributions of two samples of networks. We consider a very general framework that allows us to perform tests on large and sparse networks. Our contribution is two-fold: (1) We propose a test statistic based on the singular value of a generalized Wigner matrix. The asymptotic null distribution of the statistic is shown to be the Tracy-Widom distribution as the number of nodes tends to infinity. The test also comes with an asymptotic power guarantee, with the power tending to one under the alternative. (2) We adapt the test procedure for change-point detection in dynamic networks, and prove that it is consistent in detecting the change-points. In addition to these theoretical guarantees, another appealing feature of the adapted procedure is that it provides a principled and simple method for selecting the threshold, which is also allowed to vary with time. Extensive simulation studies and real data analyses demonstrate the superior performance of our procedure compared with competing methods.
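To make the two-sample test in contribution (1) concrete, the following is a minimal sketch of a spectral statistic of this kind, not the authors' implementation. It assumes undirected networks with independent Bernoulli edges, a plug-in variance estimate, and the centering and scaling n^(2/3) * (sigma - 2); these details, the function names, and the approximate Tracy-Widom quantile are illustrative assumptions that the abstract does not specify.

```python
import numpy as np

# Approximate 95% quantile of the Tracy-Widom (beta = 1) law.
# Assumption: quoted from standard tables, not from the abstract.
TW1_Q95 = 0.9793


def sample_networks(prob, m, rng):
    """Draw m symmetric, hollow adjacency matrices with edge probabilities prob (n x n)."""
    n = prob.shape[0]
    upper = np.triu(rng.random((m, n, n)) < prob, k=1)  # strict upper triangle only
    return (upper + upper.transpose(0, 2, 1)).astype(float)  # mirror to make symmetric


def two_sample_statistic(sample_a, sample_b, eps=1e-8):
    """Hypothetical spectral two-sample statistic for arrays of shape (m, n, n)."""
    a = np.asarray(sample_a, dtype=float)
    b = np.asarray(sample_b, dtype=float)
    m1, n, _ = a.shape
    m2 = b.shape[0]
    mean_a, mean_b = a.mean(axis=0), b.mean(axis=0)
    # Plug-in variance of the entrywise mean difference (an assumption of this sketch).
    var = mean_a * (1 - mean_a) / m1 + mean_b * (1 - mean_b) / m2
    # Standardize so that, under the null, z resembles a generalized Wigner matrix.
    z = (mean_a - mean_b) / np.sqrt((n - 1) * np.maximum(var, eps))
    np.fill_diagonal(z, 0.0)
    sigma = np.linalg.norm(z, 2)  # largest singular value
    return n ** (2 / 3) * (sigma - 2.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 60, 20
    p0 = np.full((n, n), 0.3)
    p1 = p0.copy()
    p1[:30, :30] = 0.6  # one block of nodes connects more densely under the alternative
    a = sample_networks(p0, m, rng)
    c = sample_networks(p1, m, rng)
    stat = two_sample_statistic(a, c)
    print("statistic:", stat, "reject at 5%:", stat > TW1_Q95)
```

In this sketch the null hypothesis is rejected when the statistic exceeds the chosen Tracy-Widom quantile. With few networks per sample, the plug-in variance is noisy and the finite-sample level of the test may differ from the nominal one.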