The Rate of Convergence of Variation-Constrained Deep Neural Networks - Zhuanzhi (专知) Papers

Networking · Neural Networks · Feedforward Networks · Approximation · Mean Squared Error

June 22, 2021

The Rate of Convergence of Variation-Constrained Deep Neural Networks


Gen Li, Yuantao Gu, Jie Ding

Multi-layer feedforward networks have been used to approximate a wide range of nonlinear functions. A fundamental problem is to understand the learnability of a network model through its statistical risk, that is, the expected prediction error on future data. To the best of our knowledge, the rates of convergence established for neural networks in existing work are no faster than the order of $n^{-1/4}$ for a sample size of $n$. In this paper, we show that a class of variation-constrained neural networks, with arbitrary width, can achieve the near-parametric rate $n^{-1/2+\delta}$ for an arbitrarily small positive constant $\delta$. This is equivalent to $n^{-1+2\delta}$ under the mean squared error. The rate is also observed in numerical experiments. The result indicates that the neural function space needed for approximating smooth functions may not be as large as is often perceived. Our result also provides insight into the phenomenon that deep neural networks do not easily suffer from overfitting when the number of neurons and learnable parameters grows rapidly with $n$ or even surpasses $n$. We also discuss the rate of convergence with respect to other network parameters, including the input dimension, network layer, and coefficient norm.
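One way to read the two rates in the abstract: if an error measure decays like $n^{-1/2+\delta}$, its square decays like $n^{-1+2\delta}$, so the exponent doubles under the mean squared error. The sketch below illustrates this and shows a common way to estimate an empirical rate, by fitting the slope of log-error against log-sample-size. It is not the authors' experimental code. The function name `fit_rate_exponent`, the constant `2.0`, the choice $\delta = 0.05$, and the error values are all hypothetical and generated synthetically for illustration.

```python
import math


def fit_rate_exponent(sample_sizes, errors):
    """Return the least-squares slope of log(error) against log(n).

    If error is roughly C * n**a, the slope estimates the exponent a.
    """
    xs = [math.log(n) for n in sample_sizes]
    ys = [math.log(e) for e in errors]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


# Synthetic placeholder errors that follow n**(-1/2 + delta) exactly.
# These are NOT results from the paper; delta and the constant 2.0 are assumed.
delta = 0.05
sample_sizes = [100, 1_000, 10_000, 100_000]
errors = [2.0 * n ** (-0.5 + delta) for n in sample_sizes]

# Squaring each error doubles the exponent: n**(-1 + 2 * delta).
squared_errors = [e ** 2 for e in errors]

rate = fit_rate_exponent(sample_sizes, errors)
mse_rate = fit_rate_exponent(sample_sizes, squared_errors)

print(f"fitted exponent for the error:         {rate:.3f}")
print(f"fitted exponent for the squared error: {mse_rate:.3f}")
```

With these synthetic inputs the two fitted exponents should come out at about $-1/2+\delta$ and $-1+2\delta$. On real measurements the fit would be noisy, and the slope is only meaningful over the range of $n$ where the error follows a power law.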

