In recent years, significant attention in deep learning theory has been devoted to analyzing the generalization performance of models with multiple layers of Gaussian random features. However, few works have considered the effect of feature anisotropy; most assume that features are generated using independent and identically distributed Gaussian weights. Here, we derive learning curves for models with many layers of structured Gaussian features. We show that allowing correlations between the rows of the first layer of features can aid generalization, while structure in later layers is generally detrimental. Our results shed light on how weight structure affects generalization in a simple class of solvable models.
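The setup described above can be sketched numerically. Below is a minimal, hypothetical illustration (not the paper's exact model): first-layer weight rows are drawn from an anisotropic Gaussian with a chosen covariance `Sigma`, rather than i.i.d. isotropic entries, and a ridge readout is fit on the resulting features. The dimensions, spectrum of `Sigma`, teacher, and regularization strength are all arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d, p = 200, 50, 100  # samples, input dimension, number of random features

# Hypothetical anisotropic covariance for the weight rows: eigenvalues
# decaying as 1/k, so the rows of W are correlated across input directions
# ("structured" features), unlike the i.i.d. isotropic case Sigma = I.
eigvals = 1.0 / np.arange(1, d + 1)
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
Sigma = Q @ np.diag(eigvals) @ Q.T

# Each feature row W[i] ~ N(0, Sigma) instead of N(0, I).
W = rng.multivariate_normal(np.zeros(d), Sigma, size=p)

# Illustrative data from a linear teacher (an assumption, not the paper's task).
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d) / np.sqrt(d)

# One layer of Gaussian random features (linear readout features here).
features = X @ W.T / np.sqrt(d)

# Ridge regression on the features; lam is an arbitrary regularization choice.
lam = 1e-2
a = np.linalg.solve(features.T @ features + lam * np.eye(p), features.T @ y)
train_mse = np.mean((features @ a - y) ** 2)
```

In this sketch, generalization would be probed by evaluating the fitted readout `a` on fresh draws of `X`, and comparing the resulting error for structured versus isotropic `Sigma`; the paper's learning curves characterize exactly this kind of comparison analytically.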