Information Bottleneck: Exact Analysis of (Quantized) Neural Networks

June 24, 2021

Stephan Sloth Lorenzen, Christian Igel, Mads Nielsen

The information bottleneck (IB) principle has been suggested as a way to analyze deep neural networks. The learning dynamics are studied by inspecting the mutual information (MI) between the hidden layers and the input and output. Notably, separate fitting and compression phases during training have been reported. This led to some controversy, including claims that the observations are not reproducible and depend strongly on the type of activation function used as well as on the way the MI is estimated. Our study confirms that different ways of binning when computing the MI lead to qualitatively different results, either supporting or refuting IB conjectures. To resolve the controversy, we study the IB principle in settings where MI is non-trivial and can be computed exactly. We monitor the dynamics of quantized neural networks, that is, we discretize the whole deep learning system so that no approximation is required when computing the MI. This allows us to quantify the information flow without measurement errors. In this setting, we observed a fitting phase for all layers and a compression phase for the output layer in all experiments; the compression in the hidden layers was dependent on the type of activation function. Our study shows that the initial IB results were not artifacts of binning when computing the MI. However, the critical claim that the compression phase may not be observed for some networks also holds true.
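
To make the idea of exact MI on discretized activations concrete, here is a minimal Python sketch. It is not the authors' code: the helper names (`entropy`, `mutual_information`, `quantize`), the toy activations, and the uniform quantization grid are all assumptions chosen for illustration. It relies only on the identity I(A;B) = H(A) + H(B) - H(A,B), evaluated on the empirical distribution of discrete codes, and on the fact that for a deterministic layer T = f(X) we have H(T|X) = 0, so I(X;T) = H(T).

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy (bits) of the empirical distribution of hashable symbols."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(a, b):
    """I(A;B) = H(A) + H(B) - H(A,B) for paired discrete samples, in bits."""
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))


def quantize(activation, levels, lo=-1.0, hi=1.0):
    """Map a real activation in [lo, hi] to one of `levels` integer codes."""
    clipped = min(max(activation, lo), hi)
    return min(int((clipped - lo) / (hi - lo) * levels), levels - 1)


# Hypothetical toy data: 8 distinct inputs, binary labels, and the
# activations of one 2-unit hidden layer (made-up numbers).
inputs = list(range(8))
labels = [0, 0, 0, 0, 1, 1, 1, 1]
hidden = [(-0.9, 0.1), (-0.8, 0.2), (-0.7, 0.6), (-0.2, 0.7),
          (0.1, 0.2), (0.3, -0.4), (0.8, -0.6), (0.9, -0.9)]

# Each layer state is a tuple of integer codes, so it is discrete by construction.
fine = [tuple(quantize(h, levels=4) for h in row) for row in hidden]
coarse = [tuple(quantize(h, levels=2) for h in row) for row in hidden]

i_xt_fine = mutual_information(inputs, fine)      # equals H(T) for deterministic T
i_ty_fine = mutual_information(fine, labels)
i_xt_coarse = mutual_information(inputs, coarse)

print(f"4 levels: I(X;T) = {i_xt_fine:.3f} bits, I(T;Y) = {i_ty_fine:.3f} bits")
print(f"2 levels: I(X;T) = {i_xt_coarse:.3f} bits")
```

In this toy example the value of I(X;T) changes with the number of quantization levels, which loosely mirrors the abstract's point that the choice of binning affects the measured MI. When the network itself is quantized, the codes are the true layer states rather than a post hoc binning, so the resulting value is exact for the given data.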
