The support vector machine (SVM) is a well-established classification method whose name refers to the particular training examples, called support vectors, that determine the maximum-margin separating hyperplane. The SVM classifier is known to enjoy good generalization properties when the number of support vectors is small compared to the number of training examples. However, recent research has shown that in sufficiently high-dimensional linear classification problems, the SVM can generalize well despite a proliferation of support vectors, in which every training example is a support vector. In this paper, we identify new deterministic equivalences for this phenomenon of support vector proliferation, and use them to (1) substantially broaden the conditions under which the phenomenon occurs in high-dimensional settings, and (2) prove a nearly matching converse result.
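As a rough illustration of support vector proliferation, the sketch below applies a standard duality-based check to synthetic data. It is not necessarily one of the equivalences derived in the paper. If the Gram matrix K = X Xᵀ is invertible and every coefficient αᵢ = yᵢ (K⁻¹y)ᵢ is strictly positive, then the unconstrained maximizer of the hard-margin SVM dual is feasible, so it is the dual optimum and every training example is a support vector. The function names, the isotropic Gaussian data, the random labels, and the sizes n = 20, d = 2000 are all assumptions chosen for illustration.

```python
import numpy as np


def unconstrained_dual(X, y):
    """Maximizer of the hard-margin SVM dual when alpha >= 0 is ignored."""
    K = X @ X.T                       # n x n Gram matrix, assumed invertible (needs d >= n)
    return y * np.linalg.solve(K, y)  # alpha_i = y_i * (K^{-1} y)_i


def all_support_vectors(X, y):
    """True if every unconstrained dual coefficient is strictly positive."""
    return bool(np.all(unconstrained_dual(X, y) > 0))


# Hypothetical synthetic setup: far more features than examples.
rng = np.random.default_rng(0)
n, d = 20, 2000
X = rng.standard_normal((n, d))
y = rng.choice([-1.0, 1.0], size=n)

alpha = unconstrained_dual(X, y)
w = X.T @ (y * alpha)                 # w = sum_i alpha_i * y_i * x_i
margins = y * (X @ w)                 # equals 1 for every example by construction

print("all examples are support vectors:", all_support_vectors(X, y))
print("all margins equal 1:", np.allclose(margins, 1.0))
```

When the check passes, the SVM weight vector coincides with the minimum-norm interpolator of the labels, which is why every margin constraint is tight.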