Bounding the generalization error of a supervised learning algorithm is one of the most important problems in learning theory, and various approaches have been developed. However, existing bounds are often loose and lack guarantees. As a result, they may fail to characterize the exact generalization ability of a learning algorithm. Our main contribution is an exact characterization of the expected generalization error of the well-known Gibbs algorithm in terms of the symmetrized KL information between the input training samples and the output hypothesis. Such a result can be applied to tighten existing expected generalization error bounds. Our analysis provides more insight into the fundamental role the symmetrized KL information plays in controlling the generalization error of the Gibbs algorithm.
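As a concrete illustration of the kind of identity stated above, a minimal LaTeX sketch follows; the notation ($S$ for the training samples, $W$ for the output hypothesis, $\gamma$ for the inverse temperature, $\pi(w)$ for the prior, $L_E$ for the empirical risk, $I_{\mathrm{SKL}}$ for the symmetrized KL information) is assumed for illustration rather than taken from the abstract itself:

% Sketch under assumed notation: the Gibbs algorithm with inverse
% temperature $\gamma > 0$ is the posterior
%   $P_{W|S}(w \mid s) \propto \pi(w)\, e^{-\gamma L_E(w, s)}$.
% An exact characterization of the expected generalization error then
% takes the form
\[
  \overline{\mathrm{gen}}(P_{W|S}, P_S)
    \;=\; \frac{I_{\mathrm{SKL}}(S; W)}{\gamma},
\]
% where the symmetrized KL information is the sum of the two directed
% KL divergences between the joint and product distributions,
\[
  I_{\mathrm{SKL}}(S; W)
    \;=\; D\!\left(P_{S,W} \,\|\, P_S \otimes P_W\right)
    \;+\; D\!\left(P_S \otimes P_W \,\|\, P_{S,W}\right),
\]
% i.e., the mutual information $I(S;W)$ plus the lautum information
% $L(S;W)$.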