This paper aims to address two fundamental challenges arising in eigenvector estimation and inference for a low-rank matrix from noisy observations: (1) how to estimate an unknown eigenvector when the eigen-gap (i.e., the spacing between the associated eigenvalue and the rest of the spectrum) is particularly small; (2) how to perform estimation and inference on linear functionals of an eigenvector, a form of "fine-grained" statistical reasoning that goes far beyond the usual $\ell_2$ analysis. We investigate how to address these challenges in a setting where the unknown $n\times n$ matrix is symmetric and the additive noise matrix contains independent (and non-symmetric) entries. Based on an eigen-decomposition of the asymmetric data matrix, we propose estimation and uncertainty quantification procedures for an unknown eigenvector, which further allow us to reason about linear functionals of an unknown eigenvector. The proposed procedures and the accompanying theory enjoy several important features: (1) distribution-free (i.e., prior knowledge about the noise distributions is not needed); (2) adaptive to heteroscedastic noise; (3) minimax optimal under Gaussian noise. Along the way, we establish optimal procedures to construct confidence intervals for the unknown eigenvalues. All of these guarantees hold even in the presence of a small eigen-gap (up to $O(\sqrt{n/\mathrm{poly}\log (n)})$ times smaller than the requirement in prior theory), which goes significantly beyond what generic matrix perturbation theory has to offer.
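To make the setting concrete, the following is a minimal numerical sketch, not the procedure proposed in the paper. It builds a rank-one symmetric matrix, adds independent heteroscedastic noise without symmetrizing it, and reads off the leading eigenpair of the asymmetric data matrix together with a naive plug-in estimate of a linear functional $a^\top u$. The dimension, noise scale, rank-one truth, the choice of $a$, and the helper name `leading_eig_asymmetric` are all illustrative assumptions. The sign of the estimated eigenvector is fixed using the ground truth, which is only possible in simulation.

```python
import numpy as np


def leading_eig_asymmetric(M):
    """Return the largest-magnitude eigenvalue of a real square matrix M
    and its unit-norm right eigenvector (hypothetical helper)."""
    vals, vecs = np.linalg.eig(M)      # general (non-symmetric) eigen-decomposition
    k = np.argmax(np.abs(vals))        # index of the largest-magnitude eigenvalue
    lam_hat = vals[k].real             # an isolated outlier of a real matrix is real
    u_hat = vecs[:, k].real
    return lam_hat, u_hat / np.linalg.norm(u_hat)


rng = np.random.default_rng(0)
n, lam = 300, 1.0                      # assumed dimension and true eigenvalue

# Ground truth: symmetric rank-one matrix lam * u u^T.
u = rng.standard_normal(n)
u /= np.linalg.norm(u)
M_star = lam * np.outer(u, u)

# Independent, heteroscedastic, non-symmetric noise: H[i, j] and H[j, i]
# are drawn separately, each with its own standard deviation.
scales = (0.1 / np.sqrt(n)) * rng.uniform(0.5, 1.5, size=(n, n))
H = scales * rng.standard_normal((n, n))
M = M_star + H                         # observed asymmetric data matrix

lam_hat, u_hat = leading_eig_asymmetric(M)
if u_hat @ u < 0:                      # resolve the global sign (simulation only)
    u_hat = -u_hat

# Naive plug-in estimate of the linear functional a^T u, here a = e_1.
a = np.zeros(n)
a[0] = 1.0
plug_in = a @ u_hat
truth = a @ u

print("eigenvalue error:       ", abs(lam_hat - lam))
print("eigenvector correlation:", u_hat @ u)
print("functional error:       ", abs(plug_in - truth))
```

The sketch only illustrates the data model and the use of a non-symmetric eigen-decomposition. It does not include any de-biasing, variance estimation, or confidence-interval construction, which are the substance of the procedures described in the abstract.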