Recent works have demonstrated a double descent phenomenon in over-parameterized learning. Although this phenomenon has attracted considerable attention, it is not yet fully understood theoretically. In this paper, we consider a double random feature model (DRFM), which concatenates two types of random features, and study the excess risk achieved by the DRFM in ridge regression. We compute the precise limit of the excess risk in the high-dimensional regime where the training sample size, the data dimension, and the dimensions of the random features tend to infinity proportionally. Based on this calculation, we further show theoretically that the risk curves of DRFMs can exhibit triple descent. We then provide a thorough experimental study to verify our theory. Finally, we extend our analysis to the multiple random feature model (MRFM) and show that MRFMs ensembling $K$ types of random features may exhibit $(K+1)$-fold descent. Our analysis indicates that risk curves with a specific number of descents generally arise in random feature learning and in ensemble learning with feature concatenation. Another interesting finding is that our results help explain the risk peak locations reported in the literature when training neural networks in the "neural tangent kernel" regime.
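As a rough illustration of the setting described above (not the paper's exact construction), the following Python sketch builds a "double random feature" design by concatenating two random feature maps and fitting ridge regression on the concatenated features. The choice of activations (ReLU and tanh), the linear teacher, the noise level, and the dimensions are all illustrative assumptions, not quantities taken from the paper.

```python
import numpy as np

# Illustrative sketch: concatenate two types of random features (here ReLU
# and tanh features with independent Gaussian weights, an assumption) and
# fit ridge regression on the concatenated feature matrix.

rng = np.random.default_rng(0)

n, d = 400, 100          # training sample size and data dimension
N1, N2 = 300, 300        # numbers of the two types of random features
lam = 1e-3               # ridge penalty

# Synthetic data from a simple linear teacher plus noise (an assumption).
beta = rng.standard_normal(d) / np.sqrt(d)
X_train = rng.standard_normal((n, d))
y_train = X_train @ beta + 0.1 * rng.standard_normal(n)
X_test = rng.standard_normal((2000, d))
y_test = X_test @ beta

# Two independent random weight matrices, one per feature type.
W1 = rng.standard_normal((d, N1)) / np.sqrt(d)
W2 = rng.standard_normal((d, N2)) / np.sqrt(d)

def features(X):
    # Concatenation of the two feature types: ReLU features and tanh features.
    return np.hstack([np.maximum(X @ W1, 0.0), np.tanh(X @ W2)])

Phi_train = features(X_train)
Phi_test = features(X_test)

# Ridge regression on the concatenated random features.
N = N1 + N2
theta = np.linalg.solve(Phi_train.T @ Phi_train + lam * np.eye(N),
                        Phi_train.T @ y_train)

test_risk = np.mean((Phi_test @ theta - y_test) ** 2)
print(f"test risk: {test_risk:.4f}")
```

Sweeping $N_1$ and $N_2$ (or the total feature count) in such a sketch, while keeping $n$ and $d$ fixed, is one way to trace out an empirical risk curve and observe multiple-descent behavior; the paper's precise asymptotic characterization is what identifies where the peaks occur.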