In a neural network (NN), *weight matrices* linearly transform inputs into *preactivations* that are then transformed nonlinearly into *activations*. A typical NN interleaves multitudes of such linear and nonlinear transforms to express complex functions. Thus, the (pre-)activations depend on the weights in an intricate manner. We show that, surprisingly, (pre-)activations of a randomly initialized NN become *independent* from the weights as the NN's widths tend to infinity, in the sense of asymptotic freeness in random matrix theory. We call this the Free Independence Principle (FIP), which has these consequences: 1) It rigorously justifies the calculation of the asymptotic Jacobian singular value distribution of an NN in Pennington et al. [36,37], essential for training ultra-deep NNs [48]. 2) It gives a new justification of the gradient independence assumption used for calculating the Neural Tangent Kernel of a neural network. FIP and these results hold for any neural architecture. We show FIP by proving a Master Theorem for any Tensor Program, as introduced in Yang [50,51], generalizing the Master Theorems proved in those works. As warmup demonstrations of this new Master Theorem, we give new proofs of the semicircle and Marchenko-Pastur laws, benchmarking our framework against these fundamental mathematical results.
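For concreteness, below is a minimal NumPy sketch (not code from the paper) of the two warmup facts mentioned above, checked empirically at finite size: the semicircle law for a symmetric Gaussian matrix and the Marchenko-Pastur law for a Wishart matrix. The matrix size `n = 2000` and the aspect ratio `c = 0.5` are arbitrary illustrative choices.

```python
# Empirical check of the semicircle and Marchenko-Pastur laws at finite size.
# This is an illustrative sketch, not the paper's framework or proofs.
import numpy as np

rng = np.random.default_rng(0)
n = 2000  # illustrative matrix size

# Semicircle law: eigenvalues of a symmetric matrix with (roughly) i.i.d.
# N(0, 1/n) entries concentrate on [-2, 2] as n -> infinity.
G = rng.standard_normal((n, n))
W = (G + G.T) / np.sqrt(2 * n)  # Wigner matrix
eigs = np.linalg.eigvalsh(W)
print("semicircle: fraction of eigenvalues in [-2, 2]:",
      np.mean(np.abs(eigs) <= 2.0 + 1e-2))

# Marchenko-Pastur law: for X in R^{m x n} with i.i.d. N(0, 1) entries and
# m/n -> c, the eigenvalues of X X^T / n concentrate on
# [(1 - sqrt(c))^2, (1 + sqrt(c))^2].
c = 0.5  # illustrative aspect ratio
m = int(c * n)
X = rng.standard_normal((m, n))
wishart_eigs = np.linalg.eigvalsh(X @ X.T / n)
print("MP support predicted:", ((1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2))
print("MP support observed: ", (float(wishart_eigs.min()), float(wishart_eigs.max())))
```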