We present a unified framework for deriving PAC-Bayesian generalization bounds. Unlike most previous literature on this topic, our bounds are anytime-valid (i.e., time-uniform), meaning that they hold at all stopping times, not only for a fixed sample size. Our approach combines four tools in the following order: (a) nonnegative supermartingales or reverse submartingales, (b) the method of mixtures, (c) the Donsker-Varadhan formula (or other convex duality principles), and (d) Ville's inequality. Our main result is a PAC-Bayes theorem which holds for a wide class of discrete stochastic processes. We show how this result implies time-uniform versions of well-known classical PAC-Bayes bounds, such as those of Seeger, McAllester, Maurer, and Catoni, in addition to many recent bounds. We also present several novel bounds. Our framework also enables us to relax traditional assumptions; in particular, we consider nonstationary loss functions and non-i.i.d. data. In sum, we unify the derivation of past bounds and ease the search for future bounds: one may simply check if our supermartingale or submartingale conditions are met and, if so, be guaranteed a (time-uniform) PAC-Bayes bound.
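To make the four-step recipe concrete, here is a minimal sketch of how the tools can combine in the supermartingale case. The notation is assumed for illustration and does not come from the abstract: $M_t(\theta)$ is a process indexed by a parameter $\theta$, $\nu$ is a data-free prior, $\rho$ is any posterior, and $\delta \in (0,1)$ is the error level. The theorem in the paper may be stated under different or more general conditions.

```latex
% Illustrative sketch only; the symbols M_t(\theta), \nu, \rho, \delta are assumed notation.
% (a) For each \theta, (M_t(\theta))_{t \ge 0} is a nonnegative supermartingale with M_0(\theta) \le 1.
% (b) Method of mixtures: with a data-free prior \nu, the mixture
\[
  M_t^{\mathrm{mix}} := \int M_t(\theta)\,\nu(d\theta)
\]
% is again a nonnegative supermartingale (by Tonelli's theorem) with M_0^{\mathrm{mix}} \le 1.
% (c) Donsker-Varadhan: for every \rho simultaneously,
\[
  \int \log M_t(\theta)\,\rho(d\theta)
  \;\le\; \mathrm{KL}(\rho \,\|\, \nu) + \log M_t^{\mathrm{mix}} .
\]
% (d) Ville's inequality applied to the mixture:
\[
  \mathbb{P}\bigl(\exists\, t \ge 0 : M_t^{\mathrm{mix}} \ge 1/\delta\bigr) \;\le\; \delta .
\]
% Combining (c) and (d): with probability at least 1 - \delta,
% simultaneously for all t \ge 0 and all posteriors \rho,
\[
  \int \log M_t(\theta)\,\rho(d\theta)
  \;\le\; \mathrm{KL}(\rho \,\|\, \nu) + \log(1/\delta) .
\]
```

In this sketch, a specific bound would follow by choosing a particular $M_t(\theta)$, for example an exponential process built from the losses, and rearranging the final inequality.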