A multilayer perceptron (MLP) typically consists of multiple fully connected layers with nonlinear activation functions. Several approaches have been proposed to improve them (e.g., faster convergence, a better convergence limit), but the research lacks structured ways of testing them. We test different MLP architectures by carrying out experiments on age and gender datasets. We empirically show that by whitening inputs before every linear layer and adding skip connections, our proposed MLP architecture achieves better performance. Since the whitening process includes dropout, it can also be used to approximate Bayesian inference. Our code, released models, and Docker images are open-sourced at https://github.com/tae898/age-gender/.
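To make the described architecture concrete, below is a minimal PyTorch sketch of an MLP block that whitens its input before the linear layer and wraps it in a skip connection. The class names (`ResidualMLPBlock`, `ResidualMLP`), the layer sizes, and the choice of `BatchNorm1d` followed by `Dropout` as the whitening step are illustrative assumptions, not the released implementation; see the repository linked above for the actual code.

```python
import torch
import torch.nn as nn


class ResidualMLPBlock(nn.Module):
    """One block: whiten the input (here assumed to be normalization + dropout),
    apply a linear layer with a nonlinear activation, then add a skip connection."""

    def __init__(self, dim: int, dropout: float = 0.1):
        super().__init__()
        self.whiten = nn.Sequential(nn.BatchNorm1d(dim), nn.Dropout(dropout))
        self.linear = nn.Linear(dim, dim)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.act(self.linear(self.whiten(x)))


class ResidualMLP(nn.Module):
    """A stack of residual MLP blocks with a projection layer and a prediction head."""

    def __init__(self, in_dim: int, hidden_dim: int, out_dim: int,
                 num_blocks: int = 4, dropout: float = 0.1):
        super().__init__()
        self.proj = nn.Linear(in_dim, hidden_dim)
        self.blocks = nn.Sequential(
            *[ResidualMLPBlock(hidden_dim, dropout) for _ in range(num_blocks)]
        )
        self.head = nn.Linear(hidden_dim, out_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.blocks(self.proj(x)))


if __name__ == "__main__":
    model = ResidualMLP(in_dim=512, hidden_dim=512, out_dim=101)
    x = torch.randn(8, 512)   # a batch of 8 input feature vectors
    print(model(x).shape)     # torch.Size([8, 101])
```

Because each block contains a dropout layer, approximate Bayesian inference in the Monte Carlo dropout sense can be obtained by keeping the dropout layers active at test time and averaging several stochastic forward passes; this is a standard technique and a sketch of how the abstract's claim could be realized, not necessarily the exact procedure used in the paper.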