Real networks often have severe degree heterogeneity, with the maximum, average, and minimum node degrees differing significantly. This paper examines the impact of degree heterogeneity on statistical limits of network data analysis. Introducing the heterogeneity distribution (HD) under a degree-corrected mixed-membership network model, we show that the optimal rate of mixed membership estimation is an explicit functional of the HD. This result confirms that severe degree heterogeneity may slow the error rate, even when the overall sparsity remains unchanged. To obtain a rate-optimal method, we modify an existing spectral algorithm, Mixed-SCORE, by adding a pre-PCA normalization step. This step normalizes the adjacency matrix by a diagonal matrix consisting of the $b$th power of node degrees, for some $b\in \mathbb{R}$. We discover that $b = 1/2$ is universally favorable. The resulting spectral algorithm is rate-optimal for networks with arbitrary degree heterogeneity. A technical component in our proofs is entry-wise eigenvector analysis of the normalized graph Laplacian.
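As a rough illustration of the pre-PCA normalization step, the sketch below assumes the normalization is applied symmetrically, as $D^{-b} A D^{-b}$ with $D$ the diagonal matrix of node degrees. This form is an assumption chosen because it reduces to the normalized graph Laplacian matrix $D^{-1/2} A D^{-1/2}$ at $b = 1/2$; the abstract does not spell out the exact formula. The function names `degree_normalize` and `leading_eigenvectors` and the toy graph are hypothetical, and the remaining steps of Mixed-SCORE are not reproduced here.

```python
import numpy as np


def degree_normalize(A, b=0.5, eps=1e-12):
    """Return D^{-b} A D^{-b}, where D holds the node degrees on its diagonal.

    Assumption: two-sided symmetric scaling; isolated nodes get scale 0.
    """
    A = np.asarray(A, dtype=float)
    degrees = A.sum(axis=1)
    # Clamp degrees away from zero before taking the power, then zero out
    # the scale for isolated nodes so their rows and columns stay zero.
    scale = np.where(degrees > 0, np.maximum(degrees, eps) ** (-b), 0.0)
    return scale[:, None] * A * scale[None, :]


def leading_eigenvectors(M, k):
    """Return the k eigenpairs of symmetric M with largest absolute eigenvalue."""
    vals, vecs = np.linalg.eigh(M)
    order = np.argsort(-np.abs(vals))[:k]
    return vals[order], vecs[:, order]


# Toy undirected graph: a triangle (nodes 0, 1, 2) with a pendant node 3.
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

L = degree_normalize(A, b=0.5)           # D^{-1/2} A D^{-1/2}
vals, vecs = leading_eigenvectors(L, k=2)
```

With $b = 1/2$ on a connected, non-bipartite graph such as this one, the leading eigenvalue of the normalized matrix equals 1 and its eigenvector is proportional to the square roots of the degrees; setting $b = 0$ returns the unnormalized adjacency matrix.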