In this work, we study the generalization capability of algorithms from an information-theoretic perspective. It has been shown that the expected generalization error of an algorithm is bounded from above by a function of the relative entropy between the conditional probability distribution of the algorithm's output hypothesis, given the training dataset, and its marginal probability distribution. We build upon this fact and introduce a mathematical formulation for obtaining upper bounds on this relative entropy. Assuming that the data is discrete, we then develop a strategy based on the method of types and typicality to find explicit upper bounds on the generalization error of stable algorithms, i.e., algorithms that produce similar output hypotheses given similar input datasets. In particular, we show the bounds obtained with this strategy for the case of $\epsilon$-differentially private ($\epsilon$-DP) and $\mu$-Gaussian differentially private ($\mu$-GDP) algorithms.
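For concreteness, a well-known instance of the type of bound referred to above is the mutual-information bound of Xu and Raginsky; the notation below ($W$ for the output hypothesis, $S$ for the $n$-sample training set, $\sigma$-subgaussian loss) is introduced here only for illustration and may differ from the paper's own conventions:
\[
  \bigl|\mathbb{E}\bigl[\operatorname{gen}(W, S)\bigr]\bigr|
  \;\le\;
  \sqrt{\frac{2\sigma^{2}}{n}\, I(W; S)},
  \qquad
  I(W; S) \;=\; \mathbb{E}_{S}\!\left[\, \mathrm{D}\!\left(P_{W \mid S} \,\middle\|\, P_{W}\right)\right],
\]
so any upper bound on the relative entropy $\mathrm{D}(P_{W \mid S} \,\|\, P_{W})$ translates directly into an upper bound on the expected generalization error.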