Spectral methods have emerged as a simple yet surprisingly effective approach for extracting information from massive, noisy, and incomplete data. In a nutshell, spectral methods refer to a collection of algorithms built upon the eigenvalues (resp. singular values) and eigenvectors (resp. singular vectors) of some properly designed matrices constructed from data. A diverse array of applications has been found in machine learning, data science, and signal processing. Due to their simplicity and effectiveness, spectral methods are not only used as stand-alone estimators, but also frequently employed to initialize other, more sophisticated algorithms to improve performance. While the study of spectral methods can be traced back to classical matrix perturbation theory and the method of moments, the past decade has witnessed tremendous theoretical advances in demystifying their efficacy through the lens of statistical modeling, with the aid of non-asymptotic random matrix theory. This monograph aims to present a systematic, comprehensive, yet accessible introduction to spectral methods from a modern statistical perspective, highlighting their algorithmic implications in diverse large-scale applications. In particular, our exposition gravitates around several central questions that span various applications: how to characterize the sample efficiency of spectral methods in reaching a target level of statistical accuracy, and how to assess their stability in the face of random noise, missing data, and adversarial corruptions? In addition to conventional $\ell_2$ perturbation analysis, we present a systematic $\ell_{\infty}$ and $\ell_{2,\infty}$ perturbation theory for eigenspaces and singular subspaces, which has only recently become available owing to a powerful "leave-one-out" analysis framework.
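To make the "in a nutshell" description concrete, the following is a minimal sketch of one spectral estimator, not the monograph's own code. It assumes a hypothetical rank-one model $M = \lambda u^\star u^{\star\top} + E$ with symmetric Gaussian noise $E$, estimates $u^\star$ by the leading eigenvector of $M$, and reports both the $\ell_2$ and the $\ell_\infty$ estimation errors. The function names, the default signal strength $\lambda = 10\sigma\sqrt{n}$, and the noise model are illustrative assumptions.

```python
import numpy as np


def leading_eigvec(M):
    """Unit eigenvector of a symmetric matrix M for its largest-magnitude eigenvalue."""
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, np.argmax(np.abs(vals))]


def spectral_demo(n=200, signal=None, sigma=1.0, seed=0):
    """Estimate a planted unit vector from a noisy rank-one matrix.

    Assumed model (illustrative only): M = signal * u u^T + E, where E is a
    symmetric matrix with independent N(0, sigma^2) entries on and above the
    diagonal. Returns the (l2, l_inf) errors of the spectral estimate.
    """
    rng = np.random.default_rng(seed)

    # Planted unit vector with entries +-1/sqrt(n).
    u_star = rng.choice([-1.0, 1.0], size=n) / np.sqrt(n)

    # Assumed default signal strength, well above the noise level ~ sigma*sqrt(n).
    if signal is None:
        signal = 10.0 * sigma * np.sqrt(n)
    M_star = signal * np.outer(u_star, u_star)

    # Symmetric noise: mirror the strict upper triangle onto the lower one.
    G = rng.normal(scale=sigma, size=(n, n))
    E = np.triu(G) + np.triu(G, 1).T

    # Spectral estimate: leading eigenvector of the observed matrix.
    u_hat = leading_eigvec(M_star + E)

    # An eigenvector is only defined up to a global sign; align it with u_star.
    if u_hat @ u_star < 0:
        u_hat = -u_hat

    err_l2 = np.linalg.norm(u_hat - u_star)
    err_linf = np.max(np.abs(u_hat - u_star))
    return err_l2, err_linf


if __name__ == "__main__":
    l2, linf = spectral_demo()
    print(f"l2 error: {l2:.4f}, l_inf error: {linf:.4f}")
```

The $\ell_\infty$ error is never larger than the $\ell_2$ error. The entrywise theory mentioned above concerns when it is much smaller, that is, when the estimation error is spread evenly across coordinates rather than concentrated on a few.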