8 February 2021

Mutual Information of Neural Network Initialisations: Mean Field Approximations

Jared Tanner, Giuseppe Ughi

The ability to train randomly initialised deep neural networks is known to depend strongly on the variance of the weight matrices and biases as well as the choice of nonlinear activation. Here we complement the existing geometric analysis of this phenomenon with an information theoretic alternative. Lower bounds are derived for the mutual information between an input and hidden layer outputs. Using a mean field analysis we are able to provide analytic lower bounds as functions of network weight and bias variances as well as the choice of nonlinear activation. These results show that initialisations known to be optimal from a training point of view are also superior from a mutual information perspective.
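The abstract does not state any formulas, so the following is only a rough sketch of the kind of mean field quantity that the geometric analysis it mentions typically tracks: the layer-to-layer recursion for pre-activation variance, q_l = sigma_w^2 * E[phi(sqrt(q_{l-1}) z)^2] + sigma_b^2 with z ~ N(0, 1). This is the standard variance map from the mean field literature, assumed here for illustration; it is not the paper's mutual information bound. The function names, the quadrature order, and the example parameters are all hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Gauss-Hermite nodes and weights for the weight function exp(-z^2 / 2).
# Dividing by sqrt(2*pi) turns the weighted sum into an expectation over N(0, 1).
_NODES, _WEIGHTS = hermegauss(80)
_WEIGHTS = _WEIGHTS / np.sqrt(2.0 * np.pi)


def gaussian_expectation(f):
    """Approximate E[f(z)] for z ~ N(0, 1) by Gauss-Hermite quadrature."""
    return float(np.sum(_WEIGHTS * f(_NODES)))


def variance_map(q, sigma_w2, sigma_b2, phi):
    """One step of the assumed mean field variance recursion.

    q        : pre-activation variance at the previous layer
    sigma_w2 : weight variance (scaled by fan-in, as is conventional)
    sigma_b2 : bias variance
    phi      : elementwise nonlinear activation
    """
    second_moment = gaussian_expectation(lambda z: phi(np.sqrt(q) * z) ** 2)
    return sigma_w2 * second_moment + sigma_b2


def iterate_variance(q0, sigma_w2, sigma_b2, phi, depth):
    """Return the list [q_0, q_1, ..., q_depth] under the recursion above."""
    qs = [q0]
    for _ in range(depth):
        qs.append(variance_map(qs[-1], sigma_w2, sigma_b2, phi))
    return qs


def relu(x):
    return np.maximum(x, 0.0)


if __name__ == "__main__":
    # ReLU with sigma_w2 = 2 and no bias keeps the variance constant with depth.
    print(iterate_variance(1.0, 2.0, 0.0, relu, depth=5))
    # tanh with sigma_w2 = 1 and no bias shrinks the variance with depth.
    print(iterate_variance(1.0, 1.0, 0.0, np.tanh, depth=5))
```

Under these assumptions, the sketch shows how the weight variance, the bias variance, and the activation jointly determine whether the signal variance is preserved, shrinks, or grows with depth, which is the dependence the abstract describes.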
