Achieving Small Test Error in Mildly Overparameterized Neural Networks - 专知 (Zhuanzhi) Papers


Test error · Generalization theory · Networking · Neural Networks · Optimizers

April 24, 2021

Achieving Small Test Error in Mildly Overparameterized Neural Networks


Shiyu Liang, Ruoyu Sun, R. Srikant

Recent theoretical works on over-parameterized neural nets have focused on two aspects: optimization and generalization. Many existing works that study optimization and generalization together are based on the neural tangent kernel and require a very large width. In this work, we are interested in the following question: for a binary classification problem with a two-layer mildly over-parameterized ReLU network, can we find a point with small test error in polynomial time? We first show that the landscape of loss functions with explicit regularization has the following property: all local minima and certain other points which are only stationary in certain directions achieve small test error. We then prove that for convolutional neural nets, there is an algorithm which finds one of these points in polynomial time (in the input dimension and the number of data points). In addition, we prove that for a fully connected neural net, with an additional assumption on the data distribution, there is a polynomial time algorithm.

