How Tight Can PAC-Bayes be in the Small Data Regime? - 专知 (Zhuanzhi) Papers

13 January 2022

How Tight Can PAC-Bayes be in the Small Data Regime?

Andrew Y. K. Foong, Wessel P. Bruinsma, David R. Burt, Richard E. Turner

From arXiv; published at Neural Information Processing Systems (NeurIPS) 2021

In this paper, we investigate the question: Given a small number of datapoints, for example N = 30, how tight can PAC-Bayes and test set bounds be made? For such small datasets, test set bounds adversely affect generalisation performance by withholding data from the training procedure. In this setting, PAC-Bayes bounds are especially attractive, due to their ability to use all the data to simultaneously learn a posterior and bound its generalisation risk. We focus on the case of i.i.d. data with a bounded loss and consider the generic PAC-Bayes theorem of Germain et al. While their theorem is known to recover many existing PAC-Bayes bounds, it is unclear what the tightest bound derivable from their framework is. For a fixed learning algorithm and dataset, we show that the tightest possible bound coincides with a bound considered by Catoni; and, in the more natural case of distributions over datasets, we establish a lower bound on the best bound achievable in expectation. Interestingly, this lower bound recovers the Chernoff test set bound if the posterior is equal to the prior. Moreover, to illustrate how tight these bounds can be, we study synthetic one-dimensional classification tasks in which it is feasible to meta-learn both the prior and the form of the bound to numerically optimise for the tightest bounds possible. We find that in this simple, controlled scenario, PAC-Bayes bounds are competitive with comparable, commonly used Chernoff test set bounds. However, the sharpest test set bounds still lead to better guarantees on the generalisation error than the PAC-Bayes bounds we consider.
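As a rough illustration of the Chernoff test set bound mentioned in the abstract, the sketch below computes an upper bound on the true risk from the error rate on a held-out set by numerically inverting the binary KL divergence. This is not the authors' code: the function names (`binary_kl`, `kl_inverse_upper`, `chernoff_test_set_bound`), the confidence level delta = 0.05, and the held-out sizes are assumptions chosen for illustration, and the loss is assumed to lie in [0, 1] with i.i.d. held-out points.

```python
import math


def binary_kl(q: float, p: float) -> float:
    """KL divergence between Bernoulli(q) and Bernoulli(p)."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # keep the logarithms finite
    total = 0.0
    if q > 0.0:
        total += q * math.log(q / p)
    if q < 1.0:
        total += (1.0 - q) * math.log((1.0 - q) / (1.0 - p))
    return total


def kl_inverse_upper(q: float, c: float, iters: int = 100) -> float:
    """Largest p in [q, 1] with binary_kl(q, p) <= c, found by bisection."""
    lo, hi = q, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if binary_kl(q, mid) <= c:
            lo = mid
        else:
            hi = mid
    return hi  # return the upper end so the result stays conservative


def chernoff_test_set_bound(emp_risk: float, n_test: int, delta: float) -> float:
    """Upper bound on the true risk holding with probability >= 1 - delta
    over the draw of n_test i.i.d. held-out points (loss in [0, 1])."""
    return kl_inverse_upper(emp_risk, math.log(1.0 / delta) / n_test)


if __name__ == "__main__":
    # Hypothetical split of N = 30 points: fewer held-out points leave more
    # data for training but give a looser bound for the same empirical error.
    for n_test in (10, 20, 30):
        bound = chernoff_test_set_bound(emp_risk=0.0, n_test=n_test, delta=0.05)
        print(f"n_test={n_test:2d}  bound={bound:.4f}")
```

With zero empirical error the inversion has the closed form 1 - delta^(1/n_test), which gives a quick sanity check on the bisection. The loop makes the trade-off in the abstract concrete: every point held out for the bound is a point withheld from training.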

