Various approaches have been developed to upper bound the generalization error of a supervised learning algorithm. However, existing bounds are often loose and even vacuous when evaluated in practice. As a result, they may fail to characterize the exact generalization ability of a learning algorithm. Our main contributions are exact characterizations of the expected generalization error of the well-known Gibbs algorithm (a.k.a. Gibbs posterior) using different information measures, in particular, the symmetrized KL information between the input training samples and the output hypothesis. Our result can be applied to tighten existing expected generalization error and PAC-Bayesian bounds. Our information-theoretic approach is versatile, as it also characterizes the generalization error of the Gibbs algorithm with a data-dependent regularizer and that of the Gibbs algorithm in the asymptotic regime, where it converges to the standard empirical risk minimization algorithm. Of particular relevance, our results highlight the role the symmetrized KL information plays in controlling the generalization error of the Gibbs algorithm.
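As an illustrative sketch of the kind of statement described above (the notation here is assumed, not given in the abstract): the Gibbs algorithm with inverse temperature $\gamma$, prior $\pi(w)$, and empirical risk $L_E(w, s)$ outputs a hypothesis $W$ according to the Gibbs posterior

$$P^{\gamma}_{W \mid S}(w \mid s) \;=\; \frac{\pi(w)\, e^{-\gamma L_E(w, s)}}{V(s, \gamma)}, \qquad V(s, \gamma) = \int \pi(w)\, e^{-\gamma L_E(w, s)}\, \mathrm{d}w,$$

and the exact characterization of its expected generalization error takes the form

$$\overline{\mathrm{gen}}\big(P^{\gamma}_{W \mid S}, P_S\big) \;=\; \frac{I_{\mathrm{SKL}}(S; W)}{\gamma},$$

where $I_{\mathrm{SKL}}(S; W) = D\big(P_{S,W} \,\|\, P_S \otimes P_W\big) + D\big(P_S \otimes P_W \,\|\, P_{S,W}\big)$ is the symmetrized KL information between the training samples $S$ and the output hypothesis $W$, i.e., the sum of the mutual information and the lautum information.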