In this paper, we investigate the question: Given a small number of datapoints, for example N = 30, how tight can PAC-Bayes and test set bounds be made? For such small datasets, test set bounds adversely affect generalisation performance by discarding data. In this setting, PAC-Bayes bounds are especially attractive, due to their ability to use all the data to simultaneously learn a posterior and bound its generalisation risk. We focus on the case of i.i.d. data with a bounded loss and consider the generic PAC-Bayes theorem of Germain et al. (2009) and Begin et al. (2016). While their theorem is known to recover many existing PAC-Bayes bounds, it is unclear what the tightest bound derivable from their framework is. Surprisingly, we show that for a fixed learning algorithm and dataset, the tightest bound of this form coincides with the tightest bound of the more restrictive family of bounds considered in Catoni (2007). In contrast, in the more natural case of distributions over datasets, we give examples (both analytic and numerical) showing that the family of bounds in Catoni (2007) can be suboptimal. Within the proof framework of Germain et al. (2009) and Begin et al. (2016), we establish a lower bound on the best bound achievable in expectation, which recovers the Chernoff test set bound in the case when the posterior is equal to the prior. Finally, to illustrate how tight these bounds can potentially be, we study a synthetic one-dimensional classification task in which it is feasible to meta-learn both the prior and the form of the bound to obtain the tightest PAC-Bayes and test set bounds possible. We find that in this simple, controlled scenario, PAC-Bayes bounds are surprisingly competitive with comparable, commonly used Chernoff test set bounds. However, the sharpest test set bounds still lead to better guarantees on the generalisation error than the PAC-Bayes bounds we consider.


翻译:在本文中,我们调查了这样一个问题:鉴于数据点数量少,例如N=30,PAC-Bayes和测试设定界限能有多紧?对于如此小的数据集,测试设定界限会通过丢弃数据对概括性性表现产生不利影响。在这个设置中,PAC-Bayes的界限特别有吸引力,因为它们能够使用所有数据同时学习后方和约束其概括性风险。我们侧重于有约束性损失的i.d.数据,并考虑Germain等人(2009年)和Beng等人(2016年)的通用PAC-Bayes 理论。虽然已知测试设定界限会通过丢弃数据来恢复许多现有的PAC-Bayes界限,但测试界限的界限会受到不利影响。令人惊讶的是,对于固定的学习算法和数据集来说,这种形式的最紧密结合与Catonii(2007年)所考虑的更严格性定义的框框框框关系一致。相比之下,在数据设置前端和预置的框框框中,我们给出了一个更精确的框框框框框中,我们也可以在排序中看到一个更精确的框框框框框。

0
下载
关闭预览

相关内容

专知会员服务
42+阅读 · 2020年12月18日
专知会员服务
50+阅读 · 2020年12月14日
【干货书】机器学习速查手册,135页pdf
专知会员服务
125+阅读 · 2020年11月20日
知识图谱推理,50页ppt,Salesforce首席科学家Richard Socher
专知会员服务
108+阅读 · 2020年6月10日
商业数据分析,39页ppt
专知会员服务
160+阅读 · 2020年6月2日
已删除
将门创投
5+阅读 · 2019年4月29日
Arxiv
0+阅读 · 2021年7月30日
Arxiv
0+阅读 · 2021年7月29日
Recent advances in deep learning theory
Arxiv
50+阅读 · 2020年12月20日
Parsimonious Bayesian deep networks
Arxiv
5+阅读 · 2018年10月17日
Arxiv
3+阅读 · 2017年12月14日
VIP会员
相关VIP内容
相关资讯
已删除
将门创投
5+阅读 · 2019年4月29日
Top
微信扫码咨询专知VIP会员