On the linearity of large non-linear models: when and why the tangent kernel is constant

Kernelization · Linear · Networking · Neural Networks · Hessian matrix

December 5, 2020

Chaoyue Liu, Libin Zhu, Mikhail Belkin

From arXiv; accepted as a Spotlight at NeurIPS 2020.

The goal of this work is to shed light on the remarkable phenomenon of transition to linearity of certain neural networks as their width approaches infinity. We show that the transition to linearity of the model and, equivalently, constancy of the (neural) tangent kernel (NTK) result from the scaling properties of the norm of the Hessian matrix of the network as a function of the network width. We present a general framework for understanding the constancy of the tangent kernel via Hessian scaling applicable to the standard classes of neural networks. Our analysis provides a new perspective on the phenomenon of constant tangent kernel, which is different from the widely accepted "lazy training". Furthermore, we show that the transition to linearity is not a general property of wide neural networks and does not hold when the last layer of the network is non-linear. It is also not necessary for successful optimization by gradient descent.
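The Hessian-scaling claim in the abstract can be checked numerically on a toy model. The sketch below is an illustration under stated assumptions, not the authors' code: it assumes a two-layer network f(w; x) = (1/sqrt(m)) * sum_i v_i * tanh(w_i * x) with scalar input x, fixed output weights v_i in {-1, +1}, standard normal initialization of the hidden weights w, and width m. The function name `hessian_and_gradient_norms` is hypothetical. For this model the Hessian with respect to w is diagonal, so its spectral norm is the largest absolute diagonal entry.

```python
import numpy as np

def hessian_and_gradient_norms(m, x=1.0, seed=0):
    """Return (Hessian spectral norm, gradient Euclidean norm) of the toy
    model f(w; x) = (1/sqrt(m)) * sum_i v_i * tanh(w_i * x) at initialization.
    The model and its initialization are assumptions made for illustration."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(m)            # trainable hidden weights, N(0, 1)
    v = rng.choice([-1.0, 1.0], size=m)   # fixed output weights

    t = np.tanh(w * x)
    d1 = 1.0 - t**2                       # tanh'(z)  = 1 - tanh(z)^2
    d2 = -2.0 * t * d1                    # tanh''(z) = -2 tanh(z) (1 - tanh(z)^2)

    # df/dw_i = v_i * tanh'(w_i x) * x / sqrt(m)
    grad = v * d1 * x / np.sqrt(m)

    # Each w_i feeds only its own neuron, so the Hessian is diagonal:
    # d^2 f / dw_i^2 = v_i * tanh''(w_i x) * x^2 / sqrt(m)
    hess_diag = v * d2 * x**2 / np.sqrt(m)

    # The spectral norm of a diagonal matrix is its largest absolute entry.
    return np.max(np.abs(hess_diag)), np.linalg.norm(grad)

widths = [10**2, 10**4, 10**6]
results = {m: hessian_and_gradient_norms(m) for m in widths}
for m, (h, g) in results.items():
    print(f"m={m:>8d}  ||H||={h:.6f}  ||grad||={g:.6f}")
```

In this toy setting the Hessian norm is bounded by max|tanh''| * x^2 / sqrt(m) and so shrinks as the width grows, while the gradient norm stays of order one. That is the qualitative picture the abstract describes: a vanishing Hessian norm means the model is close to its linearization in a fixed-radius ball around initialization, so the tangent kernel is nearly constant there.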
