The spectra of random feature matrices provide essential information on the conditioning of the linear system used in random feature regression problems and are thus connected to the consistency and generalization of random feature models. Random feature matrices are asymmetric, rectangular, nonlinear matrices that depend on two input variables, the data and the weights, which can make their characterization challenging. We consider two settings for the two input variables: either both are random variables, or one is a random variable and the other is well-separated, i.e., there is a minimum distance between points. Under conditions on the dimension, the complexity ratio, and the sampling variance, we show that, with high probability, the singular values of these matrices concentrate near their full expectation and near one. In particular, since the required dimension depends only on the logarithm of the number of random weights or the number of data points, our complexity bounds can be achieved even in moderate dimensions for many practical settings. The theoretical results are verified with numerical experiments.
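The concentration claim can be checked informally with a small numerical sketch. The abstract does not fix a feature map, sampling distributions, or a normalization, so everything specific below is an assumption made for illustration: complex exponential (Fourier) features exp(i<x_j, w_k>), Gaussian data with standard deviation `gamma`, Gaussian weights with standard deviation `sigma`, many more weights than data points (N much larger than m), and scaling by 1/sqrt(N). The function name `random_feature_singular_values` is hypothetical and this is not the authors' experimental code.

```python
import numpy as np


def random_feature_singular_values(m, N, d, gamma=1.0, sigma=1.0, seed=0):
    """Singular values of a normalized random Fourier feature matrix.

    Assumptions (not taken from the abstract): features exp(i <x_j, w_k>),
    data x_j ~ N(0, gamma^2 I_d), weights w_k ~ N(0, sigma^2 I_d),
    N >= m, and normalization by 1/sqrt(N).
    """
    rng = np.random.default_rng(seed)
    X = gamma * rng.standard_normal((m, d))   # m data points in R^d
    W = sigma * rng.standard_normal((d, N))   # N random weights in R^d
    A = np.exp(1j * (X @ W)) / np.sqrt(N)     # m x N complex feature matrix
    # Singular values are returned as real numbers in descending order.
    return np.linalg.svd(A, compute_uv=False)


if __name__ == "__main__":
    s = random_feature_singular_values(m=50, N=5000, d=40)
    print("smallest singular value:", s.min())
    print("largest singular value: ", s.max())
    print("condition number:       ", s.max() / s.min())
```

Under these assumptions, the extreme singular values should move closer to one as the ratio N/m grows, and rerunning with a small `d` gives a feel for the role the dimension condition plays.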