Learning curves plot the expected error of a learning algorithm as a function of the number of labeled samples it receives from a target distribution. They are widely used as a measure of an algorithm's performance, but classic PAC learning theory cannot explain their behavior. As observed by Antos and Lugosi (1996 , 1998), the classic `No Free Lunch' lower bounds only trace the upper envelope above all learning curves of specific target distributions. For a concept class with VC dimension $d$ the classic bound decays like $d/n$, yet it is possible that the learning curve for \emph{every} specific distribution decays exponentially. In this case, for each $n$ there exists a different `hard' distribution requiring $d/n$ samples. Antos and Lugosi asked which concept classes admit a `strong minimax lower bound' -- a lower bound of $d'/n$ that holds for a fixed distribution for infinitely many $n$. We solve this problem in a principled manner, by introducing a combinatorial dimension called VCL that characterizes the best $d'$ for which $d'/n$ is a strong minimax lower bound. Our characterization strengthens the lower bounds of Bousquet, Hanneke, Moran, van Handel, and Yehudayoff (2021), and it refines their theory of learning curves, by showing that for classes with finite VCL the learning rate can be decomposed into a linear component that depends only on the hypothesis class and an exponential component that depends also on the target distribution. As a corollary, we recover the lower bound of Antos and Lugosi (1996 , 1998) for half-spaces in $\mathbb{R}^d$. Finally, to provide another viewpoint on our work and how it compares to traditional PAC learning bounds, we also present an alternative formulation of our results in a language that is closer to the PAC setting.
翻译:学习曲线描绘了学习算法的预期错误, 以它从目标分布中获得的标签样本数量计 。 它们被广泛用作算法性能的衡量标准, 但典型的 PAC 学习理论无法解释它们的行为 。 正如Antos 和 Lugosi (1996, 1998) 所观察的, 经典的“ 无午餐” 下界只追溯到高于特定目标分布的所有学习曲线的上封封封。 对于具有 VC 维度的经典约束值, 如 $d/ n$, 这样的概念类, 通常会像 $d/ n$, 然而, 特定分布的学习曲线会以指数的速度加速。 在1996 年, 每个美元存在不同的“ 硬” 分布, 需要 $/ n美元 样本。 安托斯和 Lugos 询问, 哪个概念类会接受“ 强的迷你马克斯 下界 ” 。 另一种美元/ n$ 的下界, 它会维持着固定的固定分布方式, 通常是 $ 。 我们用一个原则解说的方式解决这个问题,, 通过引入一个叫做 VCLCLL, 更低的维 。