信息理论界限对学习算法泛化误差的紧致性研究 (On the tightness of information-theoretic bounds on generalization error of learning algorithms)

A recent line of works, initiated by Russo and Xu, has shown that the generalization error of a learning algorithm can be upper bounded by information measures. In most of the relevant works, the convergence rate of the expected generalization error is in the form of $O(\sqrt{\lambda/n})$ where $\lambda$ is some information-theoretic quantities such as the mutual information or conditional mutual information between the data and the learned hypothesis. However, such a learning rate is typically considered to be ``slow", compared to a ``fast rate" of $O(\lambda/n)$ in many learning scenarios. In this work, we first show that the square root does not necessarily imply a slow rate, and a fast rate result can still be obtained using this bound under appropriate assumptions. Furthermore, we identify the critical conditions needed for the fast rate generalization error, which we call the $(\eta,c)$-central condition. Under this condition, we give information-theoretic bounds on the generalization error and excess risk, with a fast convergence rate for specific learning algorithms such as empirical risk minimization and its regularized version. Finally, several analytical examples are given to show the effectiveness of the bounds.

翻译：近期的一系列研究表明，Russo和Xu等人发起的研究，已经证明了学习算法的泛化误差可以用信息量进行上限的量化。在大多数相关研究中，预期的泛化误差的收敛速度通常是$O(\sqrt{\lambda/n})$的形式，在这里，$\lambda$是数据和学习假设之间的一些信息理论量，例如互信息或条件互信息。然而，这种学习速度通常被认为是“慢”的，与许多学习场景下的“快速速度” $O(\lambda/n)$ 相比。在这项工作中，我们首先展示了平方根不一定意味着慢速度的收敛率，通过合适的假设，仍然可以使用这个界限得到快速率的泛化误差结果。此外，我们确定了快速率泛化误差所需的关键条件，我们称之为$(\eta,c)$-中心条件。在这个条件下，我们给出了关于泛化误差和多余风险的信息理论界限，并为特定的学习算法，如经验风险最小化及其正则化版本，提供了快速的收敛率。最后，给出了几个分析例子，以展示这些界限的有效性。

相关内容

泛化误差

关注 106

学习方法的泛化能力（Generalization Error）是由该方法学习到的模型对未知数据的预测能力，是学习方法本质上重要的性质。现实中采用最多的办法是通过测试泛化误差来评价学习方法的泛化能力。泛化误差界刻画了学习算法的经验风险与期望风险之间偏差和收敛速度。一个机器学习的泛化误差（Generalization Error），是一个描述学生机器在从样品数据中学习之后，离教师机器之间的差距的函数。

Into the Metaverse，93页ppt介绍元宇宙概念、应用、趋势

专知会员服务

49+阅读 · 2022年2月19日

【经典书】主动学习理论，226页pdf，Theory of Active Learning

专知会员服务

127+阅读 · 2021年7月14日

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

专知会员服务

69+阅读 · 2021年3月27日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日