We study the problem of distributed estimation of the leading singular vectors for a collection of matrices with shared invariant subspaces. In particular, we consider an algorithm that first estimates the projection matrix corresponding to the leading singular vectors of each individual matrix, then computes the average of these projection matrices, and finally returns the leading eigenvectors of the sample average. We show that the algorithm, when applied to (1) parameter estimation for a collection of independent-edge random graphs with shared singular vectors but possibly heterogeneous edge probabilities or (2) distributed PCA for independent sub-Gaussian random vectors with spiked covariance structure, yields estimates whose row-wise fluctuations are normally distributed around the rows of the true singular vectors. Leveraging these results, we also consider a two-sample test for the null hypothesis that a pair of random graphs have the same edge probabilities, and we present a test statistic whose limiting distribution is a central (resp. non-central) $\chi^2$ under the null (resp. local alternative) hypothesis.
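
The three steps of the algorithm described above can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `averaged_projection_estimate`, the argument names, and the restriction to left singular vectors are all hypothetical choices made for this example, and the target dimension `d` is assumed to be known.

```python
import numpy as np


def averaged_projection_estimate(matrices, d):
    """Estimate a shared d-dimensional leading left singular subspace.

    Hypothetical helper: `matrices` is a list of arrays with the same
    number of rows, and `d` is the (assumed known) subspace dimension.
    """
    n = matrices[0].shape[0]
    avg_proj = np.zeros((n, n))
    for M in matrices:
        # Step 1: leading d left singular vectors of each individual matrix.
        U, _, _ = np.linalg.svd(M, full_matrices=False)
        U_d = U[:, :d]
        # Accumulate the projection matrix onto their span.
        avg_proj += U_d @ U_d.T
    # Step 2: average the projection matrices.
    avg_proj /= len(matrices)
    # Step 3: leading d eigenvectors of the symmetric sample average.
    # eigh returns eigenvalues in ascending order, so reverse the columns.
    _, eigvecs = np.linalg.eigh(avg_proj)
    return eigvecs[:, ::-1][:, :d]
```

The returned basis is determined only up to an orthogonal transformation, so a comparison against a reference basis should go through the projection matrices (for example, the Frobenius norm of their difference) rather than the individual columns.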