We present a unifying picture of PAC-Bayesian and mutual-information-based upper bounds on the generalization error of randomized learning algorithms. As we show, Tong Zhang's information exponential inequality (IEI) provides a general recipe for constructing bounds of both flavors. Several important results in the literature can be recovered as simple corollaries of the IEI under different assumptions on the loss function. Moreover, we obtain new bounds for data-dependent priors and for unbounded loss functions. Optimizing the bounds gives rise to variants of the Gibbs algorithm, for which we discuss two practical examples for learning with neural networks, namely Entropy-SGD and PAC-Bayes SGD. Further, we use an Occam's factor argument to derive a PAC-Bayesian bound that incorporates second-order curvature information of the training loss.
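For concreteness, one standard form of the IEI is sketched below; the notation ($\pi$ for a data-independent prior, $\hat{\rho}_S$ for a posterior depending on the sample $S$, and $f$ for a generic measurable function) is introduced here for illustration and may differ from the paper's. This form follows from the Donsker-Varadhan variational formula combined with Fubini's theorem:
\[
\mathbb{E}_{S}\!\left[\exp\!\Big(\mathbb{E}_{W\sim\hat{\rho}_S}\big[f(W,S)\big]-\mathrm{KL}\big(\hat{\rho}_S \,\|\, \pi\big)\Big)\right]
\;\le\;
\mathbb{E}_{W\sim\pi}\,\mathbb{E}_{S}\!\left[e^{f(W,S)}\right].
\]
As a sketch of how such an inequality is used: choosing $f$ as a scaled generalization gap and controlling the right-hand side under an assumption on the loss (e.g., boundedness or sub-Gaussianity) yields PAC-Bayesian bounds via Markov's inequality, while taking $\pi$ to be the marginal distribution of $W$ and applying Jensen's inequality yields mutual-information bounds.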