The generalization error of a classifier is related to the complexity of the set of functions among which the classifier is chosen. Roughly speaking, the more complex the family, the greater the potential disparity between the training error and the population error of the classifier. This principle is embodied, in layman's terms, by Occam's razor, which suggests favoring low-complexity hypotheses over complex ones. We study a family of low-complexity classifiers that threshold the one-dimensional feature obtained by projecting the data onto a random line, after first embedding the data into a higher-dimensional space parametrized by monomials of order up to k. More specifically, the extended data is projected n times, and the best classifier among those n (based on its performance on the training data) is chosen. We obtain a bound on the generalization error of these low-complexity classifiers. The bound is less than that of any classifier with a non-trivial VC dimension, and thus less than that of a linear classifier. We also show that, given full knowledge of the class-conditional densities, the error of these classifiers converges to the optimal (Bayes) error as k and n go to infinity; if only a training dataset is given, we show that the classifiers will perfectly classify all the training points as k and n go to infinity.
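The procedure described above can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names (`poly_features`, `best_threshold_classifier`) and the choice of Gaussian random directions and data-point thresholds are illustrative assumptions; the sketch only shows the overall scheme of embedding with monomials up to order k, projecting n times, and keeping the threshold classifier with the lowest training error.

```python
import numpy as np
from itertools import combinations_with_replacement

def poly_features(X, k):
    # Embed the data using all monomials of order up to k
    # (hypothetical helper; includes the constant monomial).
    n, d = X.shape
    feats = [np.ones(n)]
    for deg in range(1, k + 1):
        for idx in combinations_with_replacement(range(d), deg):
            feats.append(np.prod(X[:, idx], axis=1))
    return np.column_stack(feats)

def best_threshold_classifier(X, y, k=2, n_proj=50, rng=None):
    # Project the embedded data onto n_proj random lines, threshold the
    # resulting 1-D feature, and keep the classifier with the lowest
    # training error. Labels y are assumed to be in {-1, +1}.
    rng = np.random.default_rng(rng)
    Phi = poly_features(X, k)
    best = None
    for _ in range(n_proj):
        w = rng.standard_normal(Phi.shape[1])  # random direction (assumed Gaussian)
        z = Phi @ w                            # 1-D projected feature
        # Candidate thresholds at the projected data points, both orientations.
        for t in np.sort(z):
            for sign in (1, -1):
                pred = np.where(sign * (z - t) >= 0, 1, -1)
                err = np.mean(pred != y)
                if best is None or err < best[0]:
                    best = (err, w, t, sign)
    return best  # (training error, direction, threshold, orientation)
```

For data that is separable in the embedded space, enough random projections will typically drive the training error to zero, consistent with the perfect-classification behavior as k and n grow.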