Gradient-based learning in multi-layer neural networks displays a number of striking features. In particular, the rate of decrease of the empirical risk is non-monotone even after averaging over large batches. Long plateaus in which one observes barely any progress alternate with intervals of rapid decrease. These successive phases of learning often take place on very different time scales. Finally, models learnt in an early phase are typically `simpler' or `easier to learn', although in a way that is difficult to formalize. Although theoretical explanations of these phenomena have been put forward, each of them captures at best certain specific regimes. In this paper, we study the gradient flow dynamics of a wide two-layer neural network in high dimension, when data are distributed according to a single-index model (i.e., the target function depends on a one-dimensional projection of the covariates). Based on a mixture of new rigorous results, non-rigorous mathematical derivations, and numerical simulations, we propose a scenario for the learning dynamics in this setting. In particular, the proposed evolution exhibits separation of timescales and intermittency. These behaviors arise naturally because the population gradient flow can be recast as a singularly perturbed dynamical system.
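To fix notation, one standard formalization of this setting is sketched below. This is written under assumed conventions: the symbols $\varphi$, $w_*$, $\sigma$, $a_j$, $w_j$, $m$, $d$ are illustrative and need not match the paper's own parametrization, and the noise term $\varepsilon$ may be taken to be zero.

% Hedged sketch of the single-index / two-layer setting; illustrative notation only.
\begin{align}
  y &= \varphi(\langle w_*, x\rangle) + \varepsilon, \qquad x \sim \mathsf{N}(0, I_d),\\
  \hat f(x;\theta) &= \frac{1}{m}\sum_{j=1}^{m} a_j\, \sigma(\langle w_j, x\rangle), \qquad
  \theta = (a_j, w_j)_{j\le m},\\
  R(\theta) &= \frac{1}{2}\,\mathbb{E}\big[\big(y - \hat f(x;\theta)\big)^2\big], \qquad
  \frac{\mathrm{d}\theta}{\mathrm{d}t} = -\nabla_\theta R(\theta).
\end{align}

In this notation, the separation of timescales mentioned above corresponds to different blocks of the parameter $\theta$ relaxing at widely different rates under the population flow $\dot\theta = -\nabla_\theta R(\theta)$, which is the structure that lends itself to a singular-perturbation analysis.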