We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.