Low-rank matrices are pervasive throughout statistics, machine learning, signal processing, optimization, and applied mathematics. In this paper, we propose a novel and user-friendly Euclidean representation framework for low-rank matrices. Correspondingly, we establish a collection of technical and theoretical tools for analyzing the intrinsic perturbation of low-rank matrices, in which the underlying referential matrix and the perturbed matrix both live on the same low-rank matrix manifold. Our analyses show that, locally around the referential matrix, the sine-theta distance between subspaces is equivalent to the Euclidean distance between two appropriately selected orthonormal bases, circumventing the orthogonal Procrustes analysis. We also establish the regularity of the proposed Euclidean representation function, which has a profound statistical impact and a meaningful geometric interpretation. These technical devices are applicable to a broad range of statistical problems. Specific applications considered in detail include the Bayesian sparse spiked covariance model with non-intrinsic loss, efficient estimation in stochastic block models where the block probability matrix may be degenerate, and least-squares estimation in biclustering problems. Both the intrinsic perturbation analysis of low-rank matrices and the regularity theorem may be of independent interest.
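As a rough numerical illustration of the two quantities the abstract compares, the sketch below computes the sine-theta distance between two nearby subspaces and the classical Procrustes-aligned Euclidean distance between their orthonormal bases. This is not the representation proposed in the paper, which is designed to avoid the Procrustes step; it only shows the standard baseline, for which the two distances agree up to a factor of at most sqrt(2). The function names, dimensions, and perturbation size are assumptions chosen for illustration.

```python
import numpy as np


def sin_theta_distance(U, V):
    """Frobenius norm of sin(Theta) between the column spaces of U and V.

    U and V are assumed to be p x r matrices with orthonormal columns.
    """
    # The singular values of U^T V are the cosines of the principal angles.
    cosines = np.linalg.svd(U.T @ V, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)  # guard against round-off above 1
    return np.sqrt(np.sum(1.0 - cosines**2))


def procrustes_distance(U, V):
    """Minimum over orthogonal W of ||U - V W||_F (orthogonal Procrustes)."""
    # If V^T U = A S B^T, the minimizing rotation is W = A B^T.
    A, _, Bt = np.linalg.svd(V.T @ U)
    W = A @ Bt
    return np.linalg.norm(U - V @ W)


# Illustrative setup (assumed values): a random r-dimensional subspace of R^p
# and a small perturbation of it.
rng = np.random.default_rng(0)
p, r = 50, 3
U, _ = np.linalg.qr(rng.standard_normal((p, r)))
V, _ = np.linalg.qr(U + 0.01 * rng.standard_normal((p, r)))

d_sin = sin_theta_distance(U, V)
d_pro = procrustes_distance(U, V)

print(f"sin-theta distance:  {d_sin:.6f}")
print(f"Procrustes distance: {d_pro:.6f}")
```

For any pair of orthonormal bases the two printed values satisfy d_sin <= d_pro <= sqrt(2) * d_sin, which is the sense in which the distances are equivalent here; the paper's claim concerns a particular choice of bases for which no alignment over W is needed.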