根据PAC-Bayesian理论视角分析彩票票票票票票假设 (Analyzing Lottery Ticket Hypothesis from PAC-Bayesian Theory Perspective) - 专知论文

会员服务 ·

0

泛化理论 · Learning · 极小值 · Attention · 尖锐最小值 ·

2022 年 6 月 2 日

Analyzing Lottery Ticket Hypothesis from PAC-Bayesian Theory Perspective

翻译：根据PAC-Bayesian理论视角分析彩票票票票票票假设

Keitaro Sakamoto,Issei Sato

The lottery ticket hypothesis (LTH) has attracted attention because it can explain why over-parameterized models often show high generalization ability. It is known that when we use iterative magnitude pruning (IMP), which is an algorithm to find sparse networks with high generalization ability that can be trained from the initial weights independently, called winning tickets, the initial large learning rate does not work well in deep neural networks such as ResNet. However, since the initial large learning rate generally helps the optimizer to converge to flatter minima, we hypothesize that the winning tickets have relatively sharp minima, which is considered a disadvantage in terms of generalization ability. In this paper, we confirm this hypothesis and show that the PAC-Bayesian theory can provide an explicit understanding of the relationship between LTH and generalization behavior. On the basis of our experimental findings that flatness is useful for improving accuracy and robustness to label noise and that the distance from the initial weights is deeply involved in winning tickets, we offer the PAC-Bayes bound using a spike-and-slab distribution to analyze winning tickets. Finally, we revisit existing algorithms for finding winning tickets from a PAC-Bayesian perspective and provide new insights into these methods.

翻译：彩票假设(LTH)吸引了人们的注意,因为它可以解释为什么过度参数化的模型往往表现出高度的概括化能力。众所周知,当我们使用迭代规模的剪切(IMP)(IMP)(IMP)(IMP)(IMP)(IMP)(IMP)(IMP)(IMP)(IMP)(IMP)(IMP)(IMP))(IMP)(IMP)(IMP)(IMP)(IMP)(IMP)(IMP)(IMP)(IMP)(IMP)(IMP)(IMP)(IMP)(IMP)(I)(IMP)(IMP)(IMP)(I)(IMP(IMP)(IMP)(I(IMP)(I)(IMP)(IMP(IMP)(IMP(IMP(I)(IMP)(I)(IMP(IMP)(IMP(IMP)(IMP(IMP))(I)(IMP)(I)(I)(IMP(IMP(IMP)(IMP)(IMP))(I)(IMP(IMP)(I)(IMP)(I)(IMP(I)(I)(I)(IMP)))(I))(I)(IMP)(I)(I)(I)(I(I)(I)(Ig)(Ig)(Ig)(Ig)(Ig)(Ig)(Ig)(Ig)(I(Ig)(Ig)(I)(I(I)(I(I(I(I(I)))))(I)(I)(I(I)(I)(I(I(I)(I)(I)(I)(I)(I)(I))(I)(I)(IMP)(I)))(I)(I)(I)(I)(I)(IMP)(I))(I)(IMP)(I)(I(I)(I(I(IMP)(IMP)(IMP

0

相关内容

泛化理论

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

专知会员服务

69+阅读 · 2021年3月27日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

253+阅读 · 2020年4月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium7

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium7

中国图象图形学学会CSIG

0+阅读 · 2021年11月15日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

中国图象图形学学会CSIG

1+阅读 · 2021年11月11日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

《数学学报》期刊

国家自然科学基金

5+阅读 · 2015年12月31日

植物分子设计中高维数据的低维稀疏逼近方法

国家自然科学基金

0+阅读 · 2015年12月31日

关于 Finsler 流形上调和映射与 Laplacian 的若干问题研究

国家自然科学基金

1+阅读 · 2014年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

MicroRNA-10a/b靶向调控ABCA1和ABCG1对胆固醇流出的影响

国家自然科学基金

0+阅读 · 2013年12月31日

双目标排序的近似算法

国家自然科学基金

0+阅读 · 2013年12月31日

函数域中的Vinogradov中值定理

国家自然科学基金

0+阅读 · 2012年12月31日

青蒿琥酯上调巨噬细胞抗菌性自噬作用的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

随机延时神经网络的吸引子和分岔

国家自然科学基金

1+阅读 · 2012年12月31日

Witten Laplacian的特征值及与其相关的Ricci Soliton研究

国家自然科学基金

0+阅读 · 2012年12月31日

Having your Privacy Cake and Eating it Too: Platform-supported Auditing of Social Media Algorithms for Public Interest

Arxiv

1+阅读 · 2022年7月18日

Gradient-based data and parameter dimension reduction for Bayesian models: an information theoretic perspective

Arxiv

0+阅读 · 2022年7月18日

Active Structure Learning of Bayesian Networks in an Observational Setting

Arxiv

0+阅读 · 2022年7月17日

Exact and asymptotic goodness-of-fit tests based on the maximum and its location of the empirical process

Arxiv

0+阅读 · 2022年7月16日

Practical tradeoffs between memory, compute, and performance in learned optimizers

Arxiv

0+阅读 · 2022年7月16日

Making Linear MDPs Practical via Contrastive Representation Learning

Arxiv

0+阅读 · 2022年7月14日

Forecasting: theory and practice

Arxiv

57+阅读 · 2022年1月5日

Collective Intelligence for Deep Learning: A Survey of Recent Developments

Arxiv

22+阅读 · 2021年12月22日

Optimization for deep learning: theory and algorithms

Optimization for deep learning: theory and algorithms

Arxiv

106+阅读 · 2019年12月19日

Class-Balanced Loss Based on Effective Number of Samples

Arxiv

12+阅读 · 2019年1月16日

VIP会员

文章信息

相关主题

尖锐最小值

相关VIP内容

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

专知会员服务

69+阅读 · 2021年3月27日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

253+阅读 · 2020年4月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【牛津大学博士论文】将序列结构与几何结构融入深度神经网络

工程视角：影响战争进程的小型无人机

企业级AI应用开发：从技术选型到生产落地

AI生成代码缺陷综述

相关资讯

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium7

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium7

中国图象图形学学会CSIG

0+阅读 · 2021年11月15日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

中国图象图形学学会CSIG

1+阅读 · 2021年11月11日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium4

中国图象图形学学会CSIG

0+阅读 · 2021年11月10日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

Having your Privacy Cake and Eating it Too: Platform-supported Auditing of Social Media Algorithms for Public Interest

Arxiv

1+阅读 · 2022年7月18日

Gradient-based data and parameter dimension reduction for Bayesian models: an information theoretic perspective

Arxiv

0+阅读 · 2022年7月18日

Active Structure Learning of Bayesian Networks in an Observational Setting

Arxiv

0+阅读 · 2022年7月17日

Exact and asymptotic goodness-of-fit tests based on the maximum and its location of the empirical process

Arxiv

0+阅读 · 2022年7月16日

Practical tradeoffs between memory, compute, and performance in learned optimizers

Arxiv

0+阅读 · 2022年7月16日

Making Linear MDPs Practical via Contrastive Representation Learning

Arxiv

0+阅读 · 2022年7月14日

Forecasting: theory and practice

Arxiv

57+阅读 · 2022年1月5日

Collective Intelligence for Deep Learning: A Survey of Recent Developments

Arxiv

22+阅读 · 2021年12月22日

Optimization for deep learning: theory and algorithms

Optimization for deep learning: theory and algorithms

Arxiv

106+阅读 · 2019年12月19日

Class-Balanced Loss Based on Effective Number of Samples

Arxiv

12+阅读 · 2019年1月16日

相关基金

《数学学报》期刊

国家自然科学基金

5+阅读 · 2015年12月31日

植物分子设计中高维数据的低维稀疏逼近方法

国家自然科学基金

0+阅读 · 2015年12月31日

关于 Finsler 流形上调和映射与 Laplacian 的若干问题研究

国家自然科学基金

1+阅读 · 2014年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

MicroRNA-10a/b靶向调控ABCA1和ABCG1对胆固醇流出的影响

国家自然科学基金

0+阅读 · 2013年12月31日

双目标排序的近似算法

国家自然科学基金

0+阅读 · 2013年12月31日

函数域中的Vinogradov中值定理

国家自然科学基金

0+阅读 · 2012年12月31日

青蒿琥酯上调巨噬细胞抗菌性自噬作用的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

随机延时神经网络的吸引子和分岔

国家自然科学基金

1+阅读 · 2012年12月31日

Witten Laplacian的特征值及与其相关的Ricci Soliton研究

国家自然科学基金

0+阅读 · 2012年12月31日

微信扫码咨询专知VIP会员