Recent works have demonstrated a double descent phenomenon in over-parameterized learning: as the number of model parameters increases, the excess risk first follows a $\mathsf{U}$-shaped curve, and then decreases again when the model is highly over-parameterized. Although this phenomenon has been investigated by recent works in various settings, such as linear models, random feature models, and kernel methods, it is not yet fully understood in theory. In this paper, we consider a double random feature model (DRFM) consisting of two types of random features, and study the excess risk achieved by the DRFM in ridge regression. We calculate the precise limit of the excess risk under a high-dimensional framework in which the training sample size, the dimension of the data, and the dimensions of the random features tend to infinity proportionally. Based on this calculation, we demonstrate that the risk curves of DRFMs can exhibit triple descent. We then provide an explanation of the triple descent phenomenon, and discuss how the ratio between the random feature dimensions, the regularization parameter, and the signal-to-noise ratio control the shape of the risk curves of DRFMs. Finally, we extend our study to the multiple random feature model (MRFM), and show that MRFMs with $K$ types of random features may exhibit $(K+1)$-fold descent. Our analysis shows that risk curves with a specific number of descents generally exist in random-feature-based regression. Another interesting finding is that our result recovers the risk peak locations reported in the literature when learning neural networks in the "neural tangent kernel" regime.
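To make the setup concrete, below is a minimal empirical sketch of ridge regression on a double random feature model. All specifics are illustrative assumptions, not the paper's exact construction: the two feature types are taken to be ReLU and tanh random features, the teacher is linear, and the values of $n$, $d$, the ridge parameter, and the SNR are arbitrary choices. The sketch sweeps the total feature dimension at a fixed ratio between the two feature types; depending on this ratio, the regularization, and the SNR, the resulting risk curve may show more than one peak.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative problem sizes (assumed, not from the paper)
n, d = 300, 100                              # training samples, input dimension
beta = rng.normal(size=d) / np.sqrt(d)       # ground-truth linear signal
snr = 5.0                                    # assumed signal-to-noise ratio
noise_std = np.linalg.norm(beta) / np.sqrt(snr)
lam = 1e-4                                   # small ridge regularization

def drfm_features(X, W1, W2):
    # Two types of random features (assumed: ReLU and tanh), concatenated
    return np.hstack([np.maximum(X @ W1.T, 0.0), np.tanh(X @ W2.T)])

def test_risk(N1, N2, n_test=2000):
    # Draw random feature weights and a fresh training set
    W1 = rng.normal(size=(N1, d)) / np.sqrt(d)
    W2 = rng.normal(size=(N2, d)) / np.sqrt(d)
    X_tr = rng.normal(size=(n, d))
    y_tr = X_tr @ beta + noise_std * rng.normal(size=n)
    F = drfm_features(X_tr, W1, W2)
    # Ridge estimator: theta = (F^T F + lam I)^{-1} F^T y
    theta = np.linalg.solve(F.T @ F + lam * np.eye(N1 + N2), F.T @ y_tr)
    # Compare predictions to the noiseless teacher to estimate excess risk
    X_te = rng.normal(size=(n_test, d))
    pred = drfm_features(X_te, W1, W2) @ theta
    return np.mean((pred - X_te @ beta) ** 2)

# Sweep the total feature dimension at a fixed 1:1 ratio between feature types
for N in [50, 100, 200, 300, 400, 600, 1200, 2400]:
    print(f"N1+N2 = {N:5d}   estimated risk = {test_risk(N // 2, N // 2):.4f}")
```

Evaluating against the noiseless teacher $X\beta$ rather than noisy labels is one common way to estimate the excess risk; averaging each point over several random draws would reduce the variance of the curve.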