March 24, 2023

Online Learning for the Random Feature Model in the Student-Teacher Framework

Roman Worschech,Bernd Rosenow

Deep neural networks are widely used prediction algorithms whose performance often improves as the number of weights increases, leading to over-parametrization. We consider a two-layer neural network whose first layer is frozen while the last layer is trainable, known as the random feature model. We study over-parametrization in a student-teacher framework by deriving a set of differential equations for the learning dynamics. For any finite ratio of hidden layer size to input dimension, the student cannot generalize perfectly, and we compute the non-zero asymptotic generalization error. An approach to perfect generalization is possible only when the student's hidden layer size is exponentially larger than the input dimension.
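
The setup described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' method: the abstract does not specify the activation function, the teacher architecture, the weight scaling, or the learning rate, so the `tanh` activation, the single-unit teacher, the `1/sqrt(N)` pre-activation scaling, the step size `eta / n_hidden`, and all function names are assumptions made here for illustration. The sketch trains only the last layer of the student by online stochastic gradient descent, drawing a fresh Gaussian input at every step, and estimates the generalization error by Monte Carlo rather than through the differential equations derived in the paper.

```python
import numpy as np


def features(F, x):
    """Hidden-layer activations of the student; F is frozen and never updated."""
    n_inputs = F.shape[1]
    return np.tanh(x @ F.T / np.sqrt(n_inputs))


def teacher(B, x):
    """Assumed teacher: a single tanh unit with fixed weights B."""
    return np.tanh(x @ B / np.sqrt(B.shape[0]))


def generalization_error(F, w, B, x_test):
    """Monte Carlo estimate of 0.5 * E[(student - teacher)^2]."""
    diff = features(F, x_test) @ w - teacher(B, x_test)
    return 0.5 * np.mean(diff ** 2)


def run_online_learning(n_inputs=20, n_hidden=100, n_steps=5000, eta=0.5, seed=0):
    """Train the student's last layer online; return (initial, final) error."""
    rng = np.random.default_rng(seed)
    F = rng.standard_normal((n_hidden, n_inputs))  # frozen random first layer
    B = rng.standard_normal(n_inputs)              # teacher weights (assumed)
    w = np.zeros(n_hidden)                         # trainable last layer
    x_test = rng.standard_normal((2000, n_inputs))

    eg_init = generalization_error(F, w, B, x_test)
    lr = eta / n_hidden  # keeps lr * ||phi||^2 <= eta because |tanh| < 1
    for _ in range(n_steps):
        x = rng.standard_normal(n_inputs)  # fresh example at every step (online)
        phi = features(F, x)
        error = teacher(B, x) - phi @ w
        w += lr * error * phi              # SGD on 0.5 * error^2, last layer only
    eg_final = generalization_error(F, w, B, x_test)
    return eg_init, eg_final


if __name__ == "__main__":
    eg_init, eg_final = run_online_learning()
    print(f"generalization error: initial {eg_init:.4f}, final {eg_final:.4f}")
```

Varying `n_hidden` at fixed `n_inputs` in this sketch gives a rough empirical view of how the residual error depends on the ratio of hidden layer size to input dimension; it is a numerical illustration only and does not reproduce the paper's analytical results.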
