Multi-layer Perceptron Trainability Explained via Variability - 专知论文

会员服务 ·

0

感知机 · 泛函 · MoDELS · 相关系数 · Networking ·

2023 年 5 月 18 日

Multi-layer Perceptron Trainability Explained via Variability

翻译：暂无翻译

Yueyao Yu,Yin Zhang

Despite the tremendous successes of deep neural networks (DNNs) in various applications, many fundamental aspects of deep learning remain incompletely understood, including DNN trainability. In a trainability study, one aims to discern what makes one DNN model easier to train than another under comparable conditions. In particular, our study focuses on multi-layer perceptron (MLP) models equipped with the same number of parameters. We introduce a new notion called variability to help explain the benefits of deep learning and the difficulties in training very deep MLPs. Simply put, variability of a neural network represents the richness of landscape patterns in the data space with respect to well-scaled random weights. We empirically show that variability is positively correlated to the number of activations and negatively correlated to a phenomenon called "Collapse to Constant", which is related but not identical to the well-known vanishing gradient phenomenon. Experiments on a small stylized model problem confirm that variability can indeed accurately predict MLP trainability. In addition, we demonstrate that, as an activation function in MLP models, the absolute value function can offer better variability than the popular ReLU function can.

翻译：暂无翻译

0

相关内容

感知机

感知机在机器学习中，感知机是一种二进制分类器监督学习的算法。二值分类器是一个函数，它可以决定输入是否属于某个特定的类，输入由一个数字向量表示。它是一种线性分类器，即基于线性预测函数结合一组权值和特征向量进行预测的分类算法。

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

252+阅读 · 2020年4月19日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

专知

20+阅读 · 2018年6月29日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

睾酮缺乏致房颤易损基质（电重构和间质纤维化）的形成与microRNA调控机制

国家自然科学基金

0+阅读 · 2014年12月31日

AMPK-Beclin-1/Vps34通路在维生素D3（Vit D)诱导足细胞自噬中的作用和机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

混凝土坝裂缝性态转异机理及其诊断方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

随机泛函微分方程的动力学性态

国家自然科学基金

0+阅读 · 2012年12月31日

TRPP2-STIM1相互作用：脑缺血再灌注损伤新机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

复杂时滞耦合非线性系统中振荡动力学的分析及控制

国家自然科学基金

0+阅读 · 2012年12月31日

间充质干细胞减轻脂肪肝供肝缺血再灌注损伤的microRNA调控机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

Ti3SiC2 MAX 相薄膜的合成及其氦损伤特性研究

国家自然科学基金

0+阅读 · 2009年12月31日

镁合金棘轮行为微观机制研究

国家自然科学基金

0+阅读 · 2008年12月31日

颗粒破碎及其对颗粒材料破坏行为影响的宏细观模拟

国家自然科学基金

0+阅读 · 2008年12月31日

Beyond In-Domain Scenarios: Robust Density-Aware Calibration

Arxiv

0+阅读 · 2023年7月4日

Robust Implicit Regularization via Weight Normalization

Arxiv

0+阅读 · 2023年6月30日

Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer

Arxiv

0+阅读 · 2023年6月30日

The Implicit Bias of Minima Stability in Multivariate Shallow ReLU Networks

Arxiv

0+阅读 · 2023年6月30日

Explainable Deep Learning: A Field Guide for the Uninitiated

Arxiv

51+阅读 · 2021年9月13日

The Principles of Deep Learning Theory

Arxiv

66+阅读 · 2021年6月18日

On Explainability of Graph Neural Networks via Subgraph Explorations

Arxiv

11+阅读 · 2021年5月31日

Deep Neural Network Based Relation Extraction: An Overview

Arxiv

14+阅读 · 2021年1月6日

Attribute-Guided Adversarial Training for Robustness to Natural Perturbations

Arxiv

15+阅读 · 2020年12月3日

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers

Arxiv

12+阅读 · 2020年6月23日

VIP会员

文章信息

相关主题

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

252+阅读 · 2020年4月19日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

不确定环境下无人机三维路径规划研究 | 221页

远征作战军事后勤规划

大语言模型将如何改变军事指挥结构

美陆军能力集成与开发系统（ACIDS）流程指南 | 2025最新122页

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

专知

20+阅读 · 2018年6月29日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

相关论文

Beyond In-Domain Scenarios: Robust Density-Aware Calibration

Arxiv

0+阅读 · 2023年7月4日

Robust Implicit Regularization via Weight Normalization

Arxiv

0+阅读 · 2023年6月30日

Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer

Arxiv

0+阅读 · 2023年6月30日

The Implicit Bias of Minima Stability in Multivariate Shallow ReLU Networks

Arxiv

0+阅读 · 2023年6月30日

Explainable Deep Learning: A Field Guide for the Uninitiated

Arxiv

51+阅读 · 2021年9月13日

The Principles of Deep Learning Theory

Arxiv

66+阅读 · 2021年6月18日

On Explainability of Graph Neural Networks via Subgraph Explorations

Arxiv

11+阅读 · 2021年5月31日

Deep Neural Network Based Relation Extraction: An Overview

Arxiv

14+阅读 · 2021年1月6日

Attribute-Guided Adversarial Training for Robustness to Natural Perturbations

Arxiv

15+阅读 · 2020年12月3日

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers

Arxiv

12+阅读 · 2020年6月23日

相关基金

睾酮缺乏致房颤易损基质（电重构和间质纤维化）的形成与microRNA调控机制

国家自然科学基金

0+阅读 · 2014年12月31日

AMPK-Beclin-1/Vps34通路在维生素D3（Vit D)诱导足细胞自噬中的作用和机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

混凝土坝裂缝性态转异机理及其诊断方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

随机泛函微分方程的动力学性态

国家自然科学基金

0+阅读 · 2012年12月31日

TRPP2-STIM1相互作用：脑缺血再灌注损伤新机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

复杂时滞耦合非线性系统中振荡动力学的分析及控制

国家自然科学基金

0+阅读 · 2012年12月31日

间充质干细胞减轻脂肪肝供肝缺血再灌注损伤的microRNA调控机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

Ti3SiC2 MAX 相薄膜的合成及其氦损伤特性研究

国家自然科学基金

0+阅读 · 2009年12月31日

镁合金棘轮行为微观机制研究

国家自然科学基金

0+阅读 · 2008年12月31日

颗粒破碎及其对颗粒材料破坏行为影响的宏细观模拟

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员