压缩两层神经网络时的尖锐消弹器 (Sharp asymptotics on the compression of two-layer neural networks) - 专知论文

会员服务 ·

0

Networking · Weight · Neural Networks · 优化器 · 缩放 ·

2022 年 8 月 16 日

Sharp asymptotics on the compression of two-layer neural networks

翻译：压缩两层神经网络时的尖锐消弹器

Mohammad Hossein Amani,Simone Bombari,Marco Mondelli,Rattana Pukdee,Stefano Rini

In this paper, we study the compression of a target two-layer neural network with N nodes into a compressed network with M<N nodes. More precisely, we consider the setting in which the weights of the target network are i.i.d. sub-Gaussian, and we minimize the population L_2 loss between the outputs of the target and of the compressed network, under the assumption of Gaussian inputs. By using tools from high-dimensional probability, we show that this non-convex problem can be simplified when the target network is sufficiently over-parameterized, and provide the error rate of this approximation as a function of the input dimension and N. In this mean-field limit, the simplified objective, as well as the optimal weights of the compressed network, does not depend on the realization of the target network, but only on expected scaling factors. Furthermore, for networks with ReLU activation, we conjecture that the optimum of the simplified optimization problem is achieved by taking weights on the Equiangular Tight Frame (ETF), while the scaling of the weights and the orientation of the ETF depend on the parameters of the target network. Numerical evidence is provided to support this conjecture.

翻译：在本文中,我们研究将带有N节点的目标双层神经网络压缩为M<N节点的压缩网络。更准确地说,我们考虑目标网络的权重是 i.d. ob-Gausian 的设置,在高山投入的假设下,将目标网络产出与压缩网络产出之间的人口L_2损失最小化,在高山投入的假设下,我们通过使用高维概率的工具,表明当目标网络足够过分时,可以简化这一非convex问题,并将这一近似误差率作为输入维度和N的函数提供。在这个平均值范围内,简化目标以及压缩网络的最佳权重并不取决于目标网络的实现,而只取决于预期的缩放因子。此外,对于使用ReLU 激活的网络,我们推测,通过对目标网络的权重进行权衡,可以实现简化优化问题的最佳程度,同时将Equicorgy Tight Fram(ETF)的重量和方向作为输入维度函数的函数。在这个平均限制中,ETF的重量和方向取决于目标网络的参数的参数。

0

相关内容

Networking

Networking：IFIP International Conferences on Networking。 Explanation：国际网络会议。 Publisher：IFIP。 SIT： http://dblp.uni-trier.de/db/conf/networking/index.html

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

75+阅读 · 2022年6月28日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

2020数据工程师成长路线图

专知会员服务

19+阅读 · 2020年9月6日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

专知会员服务

162+阅读 · 2020年1月16日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

全球人工智能

20+阅读 · 2017年12月17日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

拟南芥DIF（DRIP1-Interacting Factor）在胁迫信号应答中的功能分析

国家自然科学基金

0+阅读 · 2012年12月31日

Arisandilactone A 的不对称全合成

国家自然科学基金

0+阅读 · 2012年12月31日

三维石墨烯导电网络制备与宽光谱太阳电池探索

国家自然科学基金

0+阅读 · 2012年12月31日

ERF转录调控柑橘果实萜类芳香物质形成的机制

国家自然科学基金

0+阅读 · 2012年12月31日

通过Diels-Alder反应和多重氢键构筑具有自修复功能的PCL-PEG可逆交联网络及其三重形状记忆行为研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于准种特点的乙型肝炎病毒基因型耐药和表型耐药的规律性研究

国家自然科学基金

0+阅读 · 2012年12月31日

一类四阶MEMS方程的解集结构与解的渐近性态

国家自然科学基金

0+阅读 · 2011年12月31日

多天线OFDM信道全信息压缩估计理论与方法

国家自然科学基金

0+阅读 · 2011年12月31日

线性积分方程的Galerkin快速谱方法

国家自然科学基金

0+阅读 · 2009年12月31日

Probabilistic partition of unity networks for high-dimensional regression problems

Arxiv

0+阅读 · 2022年10月6日

The Influence of Learning Rule on Representation Dynamics in Wide Neural Networks

Arxiv

0+阅读 · 2022年10月5日

Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks

Arxiv

0+阅读 · 2022年10月4日

Analysis of the rate of convergence of an over-parametrized deep neural network estimate learned by gradient descent

Arxiv

0+阅读 · 2022年10月4日

Tail asymptotics for the bivariate skew normal in the general case

Arxiv

0+阅读 · 2022年10月4日

Stability Analysis and Generalization Bounds of Adversarial Training

Arxiv

0+阅读 · 2022年10月3日

Shuffled linear regression through graduated convex relaxation

Arxiv

0+阅读 · 2022年9月30日

ReLU Neural Networks Learn the Simplest Models: Neural Isometry and Exact Recovery

Arxiv

0+阅读 · 2022年9月30日

Risk Control for Online Learning Models

Arxiv

0+阅读 · 2022年9月30日

Optimizing towards the best insertion-based error-tolerating joints

Arxiv

0+阅读 · 2022年9月30日

VIP会员

文章信息

相关主题

Neural Networks

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

75+阅读 · 2022年6月28日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

2020数据工程师成长路线图

专知会员服务

19+阅读 · 2020年9月6日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

UC.Berkeley CS189讲义教材:《机器学习全面指南》，185页pdf

专知会员服务

162+阅读 · 2020年1月16日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

新书册《几何深度学习的数学基础》

中程单向攻击无人机的战略意义：俄乌战争启示

在无标注条件下适配视觉—语言模型：全面综述

面向视觉语言模型的持续学习：遗忘之外的综述与分类体系

相关资讯

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

全球人工智能

20+阅读 · 2017年12月17日

相关论文

Probabilistic partition of unity networks for high-dimensional regression problems

Arxiv

0+阅读 · 2022年10月6日

The Influence of Learning Rule on Representation Dynamics in Wide Neural Networks

Arxiv

0+阅读 · 2022年10月5日

Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks

Arxiv

0+阅读 · 2022年10月4日

Analysis of the rate of convergence of an over-parametrized deep neural network estimate learned by gradient descent

Arxiv

0+阅读 · 2022年10月4日

Tail asymptotics for the bivariate skew normal in the general case

Arxiv

0+阅读 · 2022年10月4日

Stability Analysis and Generalization Bounds of Adversarial Training

Arxiv

0+阅读 · 2022年10月3日

Shuffled linear regression through graduated convex relaxation

Arxiv

0+阅读 · 2022年9月30日

ReLU Neural Networks Learn the Simplest Models: Neural Isometry and Exact Recovery

Arxiv

0+阅读 · 2022年9月30日

Risk Control for Online Learning Models

Arxiv

0+阅读 · 2022年9月30日

Optimizing towards the best insertion-based error-tolerating joints

Arxiv

0+阅读 · 2022年9月30日

相关基金

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

拟南芥DIF（DRIP1-Interacting Factor）在胁迫信号应答中的功能分析

国家自然科学基金

0+阅读 · 2012年12月31日

Arisandilactone A 的不对称全合成

国家自然科学基金

0+阅读 · 2012年12月31日

三维石墨烯导电网络制备与宽光谱太阳电池探索

国家自然科学基金

0+阅读 · 2012年12月31日

ERF转录调控柑橘果实萜类芳香物质形成的机制

国家自然科学基金

0+阅读 · 2012年12月31日

通过Diels-Alder反应和多重氢键构筑具有自修复功能的PCL-PEG可逆交联网络及其三重形状记忆行为研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于准种特点的乙型肝炎病毒基因型耐药和表型耐药的规律性研究

国家自然科学基金

0+阅读 · 2012年12月31日

一类四阶MEMS方程的解集结构与解的渐近性态

国家自然科学基金

0+阅读 · 2011年12月31日

多天线OFDM信道全信息压缩估计理论与方法

国家自然科学基金

0+阅读 · 2011年12月31日

线性积分方程的Galerkin快速谱方法

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员