We investigate random matrices whose entries are obtained by applying a nonlinear kernel function to pairwise inner products between $n$ independent data vectors, drawn uniformly from the unit sphere in $\mathbb{R}^d$. This study is motivated by applications in machine learning and statistics, where these kernel random matrices and their spectral properties play significant roles. We establish the weak limit of the empirical spectral distribution of these matrices in a polynomial scaling regime, where $d, n \to \infty$ such that $n / d^\ell \to \kappa$, for some fixed $\ell \in \mathbb{N}$ and $\kappa \in (0, \infty)$. Our findings generalize an earlier result by Cheng and Singer, who examined the same model in the linear scaling regime (with $\ell = 1$). Our work reveals an equivalence principle: the spectrum of the random kernel matrix is asymptotically equivalent to that of a simpler matrix model, constructed as a linear combination of a (shifted) Wishart matrix and an independent matrix sampled from the Gaussian orthogonal ensemble. The aspect ratio of the Wishart matrix and the coefficients of the linear combination are determined by $\ell$ and the expansion of the kernel function in the orthogonal Hermite polynomial basis. Consequently, the limiting spectrum of the random kernel matrix can be characterized as the free additive convolution between a Marchenko-Pastur law and a semicircle law. We also extend our results to cases with data vectors sampled from isotropic Gaussian distributions instead of spherical distributions.
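As a rough numerical illustration of the equivalence principle in the linear regime ($\ell = 1$), the following sketch compares a kernel matrix built from spherical data with a surrogate of the form "shifted Wishart plus GOE". The abstract does not fix the normalizations, so several choices here are assumptions: the scaling $f(\sqrt{d}\,\langle x_i, x_j\rangle)/\sqrt{n}$ with zero diagonal, a kernel with only first- and second-order Hermite components (coefficients `a1`, `a2`, zero constant term), and the assignment of `a1` to the Wishart part and `a2` to the GOE part. All function names are hypothetical.

```python
import numpy as np


def sphere_points(n, d, rng):
    """Draw n points uniformly from the unit sphere in R^d."""
    g = rng.standard_normal((n, d))
    return g / np.linalg.norm(g, axis=1, keepdims=True)


def hermite_kernel(t, a1, a2):
    """Kernel a1*h1(t) + a2*h2(t), with orthonormal Hermite polynomials
    h1(t) = t and h2(t) = (t^2 - 1)/sqrt(2). The constant term is zero
    (assumption: this avoids a rank-one spike in the spectrum)."""
    return a1 * t + a2 * (t**2 - 1.0) / np.sqrt(2.0)


def kernel_matrix(x, f):
    """Assumed normalization: A_ij = f(sqrt(d) <x_i, x_j>) / sqrt(n)
    for i != j, and A_ii = 0."""
    n, d = x.shape
    a = f(np.sqrt(d) * (x @ x.T)) / np.sqrt(n)
    np.fill_diagonal(a, 0.0)
    return a


def surrogate_matrix(n, d, a1, a_rest, rng):
    """Assumed surrogate for ell = 1: a1 * (shifted Wishart) + a_rest * GOE,
    with both components scaled so off-diagonal entries have variance 1/n."""
    z = rng.standard_normal((n, d))
    wishart = (z @ z.T - d * np.eye(n)) / np.sqrt(n * d)
    g = rng.standard_normal((n, n))
    goe = (g + g.T) / np.sqrt(2.0 * n)
    return a1 * wishart + a_rest * goe
```

Under these assumptions, with for instance `n = 300`, `d = 150`, `a1 = 0.6`, and `a2 = 0.8`, histograms of `np.linalg.eigvalsh` applied to the two matrices should look similar, and both second spectral moments should be close to `a1**2 + a2**2`. This is only a finite-size sanity check, not a verification of the limit theorem.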