We control the probability of the uniform deviation between the empirical and generalization performances of multi-category classifiers by an empirical L1-norm covering number, when these performances are defined on the basis of the truncated hinge loss function. The only assumption made on the functions implemented by multi-category classifiers is that they are of bounded variation (BV). For such classifiers, we derive a sample size estimate sufficient for these performances to be close with high probability. In particular, we are interested in the dependence of this estimate on the number C of classes. To this end, we first upper bound the fat-shattering dimension, the scale-sensitive version of the VC-dimension, of sets of BV functions defined on R^d; the bound is O(1/epsilon^d) as the scale epsilon goes to zero. Second, we provide a sharper decomposition result for the fat-shattering dimension in terms of C, which for sets of BV functions gives an improvement from O(C^(d/2+1)) to O(C ln^2(C)). This improvement then propagates to the sample complexity estimate.
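To give a feel for the size of the improvement in the dependence on C, the following minimal sketch compares the two growth rates stated above. It is only an illustration: the function names `old_rate` and `new_rate` are hypothetical, the constants hidden in the O(.) notation are ignored (taken as 1, which is an assumption), and the input dimension d = 2 is an arbitrary choice.

```python
import math


def old_rate(C, d):
    """Growth rate C^(d/2 + 1) of the earlier bound, with the hidden constant taken as 1."""
    return C ** (d / 2 + 1)


def new_rate(C):
    """Growth rate C * ln^2(C) of the sharper bound, with the hidden constant taken as 1."""
    return C * math.log(C) ** 2


if __name__ == "__main__":
    d = 2  # illustrative input dimension (assumption)
    for C in (10, 100, 1000):
        ratio = old_rate(C, d) / new_rate(C)
        print(f"C={C}: old={old_rate(C, d):.1f}, new={new_rate(C):.1f}, ratio={ratio:.2f}")
```

Because the constants are dropped, the printed ratios indicate only the asymptotic trend in C, not actual sample sizes. For small d the crossover point can be large: with d = 1, for example, C ln^2(C) exceeds C^(3/2) for moderate values of C.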