We present a framework to derive bounds on the test loss of randomized learning algorithms for the case of bounded loss functions. This framework leads to bounds that depend on the conditional information density between the output hypothesis and the choice of the training set, given a larger set of data samples from which the training set is formed. Furthermore, the bounds pertain to the average test loss as well as to its tail probability, both for the PAC-Bayesian and the single-draw settings. If the conditional information density is bounded uniformly in the size $n$ of the training set, our bounds decay as $1/n$, which is referred to as a fast rate. This is in contrast with the tail bounds involving conditional information measures available in the literature, which have a less benign $1/\sqrt{n}$ dependence. We demonstrate the usefulness of our tail bounds by showing that they lead to estimates of the test loss achievable with several neural network architectures trained on MNIST and Fashion-MNIST that match the state-of-the-art bounds available in the literature.
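To make the rate contrast concrete, the following schematic is a minimal sketch of the two regimes; it is not the paper's exact bound. The constants, the information term $\imath(W,S \mid \mathbf{Z})$ (standing in for the conditional information density between the hypothesis $W$ and the training-set choice $S$ given the superset of samples $\mathbf{Z}$), and the confidence term $\log(1/\delta)$ are placeholders introduced here for illustration only.

```latex
% Illustrative comparison of slow-rate vs. fast-rate tail bounds
% (schematic placeholders, not the bounds derived in the paper).
% L_test, L_train : test and training losses
% n               : training-set size
% \imath          : conditional information density (assumed uniformly bounded)
% \delta          : tail probability / confidence parameter
\begin{align}
  \text{slow rate:}\quad
    L_{\mathrm{test}} &\lesssim L_{\mathrm{train}}
      + \sqrt{\frac{\imath(W,S \mid \mathbf{Z}) + \log(1/\delta)}{n}}, \\
  \text{fast rate:}\quad
    L_{\mathrm{test}} &\lesssim c\, L_{\mathrm{train}}
      + \frac{\imath(W,S \mid \mathbf{Z}) + \log(1/\delta)}{n}.
\end{align}
```

Under the assumption that the information term stays bounded as $n$ grows, the second form shrinks linearly in $n$, whereas the first only shrinks as $1/\sqrt{n}$, which is the distinction the abstract refers to.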