Teacher-student models provide a powerful framework in which the typical-case performance of high-dimensional supervised learning tasks can be studied in closed form. In this setting, labels are assigned to data - often taken to be i.i.d. Gaussian - by a teacher model, and the goal is to characterise the typical performance of the student model in recovering the parameters that generated the labels. In this manuscript we discuss a generalisation of this setting in which the teacher and student can act on different spaces, generated with fixed but generic feature maps. This is achieved via the rigorous study of a high-dimensional Gaussian covariate model. Our contribution is twofold. First, we prove a rigorous formula for the asymptotic training loss and generalisation error achieved by empirical risk minimisation for this model. Second, we present a number of situations where the learning curve of the model captures that of a realistic data set learned with kernel regression and classification, with out-of-the-box feature maps such as random projections or scattering transforms, or with pre-learned ones, such as the features learned by training multi-layer neural networks. We discuss both the power and the limitations of the Gaussian teacher-student framework as a typical-case analysis capturing learning curves as encountered in practice on real data sets.
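To make the setting concrete, the following is a minimal numerical sketch of a Gaussian covariate teacher-student experiment, not the procedure used in the manuscript. It assumes a randomly drawn joint covariance, a sign-activation teacher acting on the teacher features `u`, and a ridge-regression student acting on the student features `v`. All function names, dimensions, and the choice of loss are illustrative assumptions.

```python
import numpy as np


def make_joint_covariance(p, d, rng):
    """Random positive-definite covariance over the p + d coordinates (assumed form)."""
    a = rng.standard_normal((p + d, p + d)) / np.sqrt(p + d)
    return a @ a.T + 0.1 * np.eye(p + d)


def sample_pairs(n, sigma, p, rng):
    """Draw n jointly Gaussian pairs: u in R^p for the teacher, v in R^d for the student."""
    z = rng.multivariate_normal(np.zeros(sigma.shape[0]), sigma, size=n)
    return z[:, :p], z[:, p:]


def teacher_labels(u, theta0):
    """Binary labels generated by the teacher from its own feature space."""
    return np.sign(u @ theta0 / np.sqrt(u.shape[1]))


def ridge_student(v, y, lam):
    """Empirical risk minimiser for the square loss with an l2 penalty of strength lam."""
    d = v.shape[1]
    return np.linalg.solve(v.T @ v + lam * np.eye(d), v.T @ y)


def run_experiment(n_train=400, n_test=2000, p=50, d=80, lam=1.0, seed=0):
    rng = np.random.default_rng(seed)
    sigma = make_joint_covariance(p, d, rng)
    theta0 = rng.standard_normal(p)  # teacher weights (illustrative)

    u_tr, v_tr = sample_pairs(n_train, sigma, p, rng)
    u_te, v_te = sample_pairs(n_test, sigma, p, rng)
    y_tr = teacher_labels(u_tr, theta0)
    y_te = teacher_labels(u_te, theta0)

    w = ridge_student(v_tr, y_tr, lam)

    # Mean squared training residual (penalty term excluded) and
    # misclassification rate on fresh samples.
    train_loss = float(np.mean((y_tr - v_tr @ w) ** 2))
    test_error = float(np.mean(np.sign(v_te @ w) != y_te))
    return train_loss, test_error


if __name__ == "__main__":
    loss, err = run_experiment()
    print(f"training loss: {loss:.3f}, test classification error: {err:.3f}")
```

Repeating `run_experiment` over a range of `n_train` values traces an empirical learning curve. This is the kind of finite-size curve that an asymptotic formula for the training loss and generalisation error would be compared against.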