This paper presents an extension of the classical agnostic PAC learning model in which a learning problem is modelled not only by a Hypothesis Space $\mathcal{H}$ but also by a Learning Space $\mathbb{L}(\mathcal{H})$: a cover of $\mathcal{H}$, constrained by a VC-dimension property, that is a suitable domain for Model Selection algorithms. Our main contribution is a general data-driven learning algorithm that performs regularized Model Selection on $\mathbb{L}(\mathcal{H})$. A remarkable, formally proved consequence of this approach is a set of conditions on $\mathbb{L}(\mathcal{H})$ and on the loss function under which the estimated out-of-sample error surfaces are true U-curves on the chains of $\mathbb{L}(\mathcal{H})$, enabling a more efficient search on $\mathbb{L}(\mathcal{H})$. To our knowledge, this is the first rigorous result asserting that a non-exhaustive search of a family of candidate models can return an optimal solution. In this new framework, a U-curve optimization algorithm becomes a natural component of Model Selection, and hence of learning algorithms. The abstract general framework proposed here may have important implications for modern learning models and for areas such as Neural Architecture Search.
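To illustrate why a U-curve property permits a non-exhaustive search, the following minimal Python sketch scans a single chain of candidate models and stops at the first strict increase of the estimated error. It is an illustration under stated assumptions, not the algorithm of the paper: the function name `u_curve_chain_search`, the callable `estimated_error`, and the toy error values are all hypothetical, and the sketch assumes the estimated error along the chain is non-increasing up to its minimum and non-decreasing afterwards.

```python
from typing import Callable, Sequence, Tuple, TypeVar

M = TypeVar("M")


def u_curve_chain_search(
    chain: Sequence[M],
    estimated_error: Callable[[M], float],
) -> Tuple[M, float, int]:
    """Scan a chain in order and stop at the first strict increase.

    Assumption (hypothetical, for illustration): the estimated error along
    the chain is non-increasing up to its minimum and non-decreasing after
    it, so the first strict increase certifies that the minimum was seen.
    Returns (best model, its estimated error, number of evaluations).
    """
    if not chain:
        raise ValueError("chain must be non-empty")

    best_model = chain[0]
    best_error = estimated_error(best_model)
    previous_error = best_error
    evaluations = 1

    for model in chain[1:]:
        error = estimated_error(model)
        evaluations += 1
        if error > previous_error:
            # The curve has started to rise; under the U-curve assumption
            # no later model on this chain can beat the best one seen.
            break
        if error < best_error:
            best_model, best_error = model, error
        previous_error = error

    return best_model, best_error, evaluations


# Toy chain of six nested models, indexed by complexity (made-up errors).
toy_errors = {1: 0.9, 2: 0.5, 3: 0.3, 4: 0.4, 5: 0.6, 6: 0.8}
best, err, n_eval = u_curve_chain_search(list(toy_errors), toy_errors.__getitem__)
```

On this toy chain the scan evaluates four of the six models before stopping. The early stop is only justified when the U-curve assumption holds; without it, the returned model is merely a local minimum along the chain.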