Phase Diagram of Initial Condensation for Two-layer Neural Networks - 专知论文


Initialization · Neural networks · Deep neural networks · Nonlinear learning · Hyperparameter selection

April 8, 2023

Phase Diagram of Initial Condensation for Two-layer Neural Networks


Zhengan Chen, Yuqing Li, Tao Luo, Zhangchen Zhou, Zhi-Qin John Xu

Why neural networks behave so differently under different scales of initialization remains an open question in deep learning research. In this paper, building on the earlier work of Luo et al. (2021), we present a phase diagram of initial condensation for two-layer neural networks. Condensation is the phenomenon in which the weight vectors of a neural network concentrate on isolated orientations during training; it is a feature of the non-linear learning process that enables neural networks to generalize better. Our phase diagram gives a comprehensive picture of the dynamical regimes of neural networks and of how they depend on the hyperparameters related to initialization. Furthermore, we demonstrate in detail the mechanisms by which small initialization leads to condensation at the initial stage of training.
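
To make the notion of condensation concrete, here is a minimal NumPy sketch. It is not the paper's experimental setup. It trains a small two-layer tanh network from a very small initialization and measures how strongly the hidden neurons' input weight vectors line up with each other. Every specific choice here is an assumption made for illustration: the tanh activation, the linear target, the scale `eps`, the network width, the learning rate, the step count, and the helper names `mean_abs_cosine` and `train_two_layer`.

```python
import numpy as np


def mean_abs_cosine(W):
    """Mean |cosine similarity| over all distinct pairs of rows of W."""
    unit = W / np.linalg.norm(W, axis=1, keepdims=True)
    cos = np.abs(unit @ unit.T)
    off_diag = ~np.eye(W.shape[0], dtype=bool)  # drop each row paired with itself
    return float(cos[off_diag].mean())


def train_two_layer(eps=1e-5, m=50, d=5, n=200, lr=0.01, steps=800, seed=0):
    """Full-batch gradient descent on f(x) = sum_k a_k * tanh(w_k . x).

    Loss is the mean squared error (1 / 2n) * sum_i (f(x_i) - y_i)^2.
    Returns the input weights before and after training.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    w_true = np.ones(d) / np.sqrt(d)      # hypothetical target direction
    y = X @ w_true                        # assumed linear target
    W = eps * rng.normal(size=(m, d))     # input weights, one row per neuron
    a = eps * rng.normal(size=m)          # output weights
    W0 = W.copy()
    for _ in range(steps):
        H = np.tanh(X @ W.T)              # hidden activations, shape (n, m)
        resid = H @ a - y                 # prediction error, shape (n,)
        grad_a = H.T @ resid / n
        # d tanh(z) / dz = 1 - tanh(z)^2
        grad_W = ((resid[:, None] * (1.0 - H**2)) * a).T @ X / n
        a -= lr * grad_a
        W -= lr * grad_W
    return W0, W


W_init, W_trained = train_two_layer()
print("mean |cos| at initialization:", mean_abs_cosine(W_init))
print("mean |cos| after training:   ", mean_abs_cosine(W_trained))
```

Under these assumptions the second number is expected to be much closer to 1 than the first: the weight vectors have gathered along a single orientation (up to sign) while the network output is still close to zero. Raising `eps` toward order one is a simple way to explore how the effect depends on the initialization scale.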

