PAC-Bayes Information Bottleneck - Zhuanzhi (专知) Papers


February 5, 2022

PAC-Bayes Information Bottleneck


Zifeng Wang, Shao-Lun Huang, Ercan E. Kuruoglu, Jimeng Sun, Xi Chen, Yefeng Zheng

From arXiv; ICLR'22 (Spotlight)

Understanding the source of the superior generalization ability of neural networks (NNs) remains one of the most important problems in ML research. A series of theoretical works has tried to derive non-vacuous bounds for NNs. Recently, the compression of information stored in weights (IIW) was proved to play a key role in NNs' generalization based on the PAC-Bayes theorem. However, no solution for IIW has ever been provided, which creates a barrier to further investigation of IIW's properties and its potential in practical deep learning. In this paper, we propose an algorithm for the efficient approximation of IIW. Then, we build an IIW-based information bottleneck on the trade-off between accuracy and information complexity of NNs, namely PIB. From PIB, we can empirically identify the fitting-to-compressing phase transition during NNs' training and the concrete connection between IIW compression and generalization. Besides, we verify that IIW is able to explain NNs in broad cases, e.g., varying batch sizes, over-parameterization, and noisy labels. Moreover, we propose an MCMC-based algorithm to sample from the optimal weight posterior characterized by PIB, which fulfills the potential of IIW in enhancing NNs in practice.
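The trade-off between accuracy and information complexity mentioned in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the weight posterior and prior are diagonal Gaussians, so that a closed-form KL divergence stands in for the information stored in weights. The names `gaussian_kl` and `pib_style_objective` and the trade-off weight `beta` are hypothetical and are not taken from the paper; this is not the authors' approximation algorithm.

```python
import math


def gaussian_kl(mu_q, var_q, mu_p, var_p):
    """KL( N(mu_q, diag(var_q)) || N(mu_p, diag(var_p)) ) in nats.

    Assumption: both distributions are diagonal Gaussians, given as
    equal-length lists of per-weight means and variances.
    """
    total = 0.0
    for mq, vq, mp, vp in zip(mu_q, var_q, mu_p, var_p):
        total += 0.5 * (math.log(vp / vq) + (vq + (mq - mp) ** 2) / vp - 1.0)
    return total


def pib_style_objective(empirical_loss, info_nats, beta):
    """Accuracy term plus beta times an information-complexity term.

    Assumption: info_nats is a stand-in for the information stored in
    weights, here supplied by gaussian_kl.
    """
    return empirical_loss + beta * info_nats


if __name__ == "__main__":
    # A toy two-weight model: posterior shifted away from a standard normal prior.
    kl = gaussian_kl([1.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0])
    print("information term (nats):", kl)
    print("objective:", pib_style_objective(0.3, kl, beta=0.1))
```

A larger `beta` penalizes posteriors that move far from the prior, which in this toy setting favors weights that store less information about the training sample at the cost of a higher empirical loss.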

