We prove bounds on the population risk of the maximum margin algorithm for two-class linear classification. For linearly separable training data, the maximum margin algorithm has been shown in previous work to be equivalent to a limit of training with logistic loss using gradient descent, as the training error is driven to zero. We analyze this algorithm applied to random data including misclassification noise. Our assumptions on the clean data include the case in which the class-conditional distributions are standard normal distributions. The misclassification noise may be chosen by an adversary, subject to a limit on the fraction of corrupted labels. Our bounds show that, with sufficient over-parameterization, the maximum margin algorithm trained on noisy data can achieve nearly optimal population risk.
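The equivalence mentioned above (gradient descent on logistic loss converging, in direction, to the maximum margin classifier as training error goes to zero) can be illustrated with a minimal sketch. The data generation, step size, and iteration counts below are illustrative assumptions, not taken from the paper; the example only checks that the training error reaches zero and that the normalized iterate stabilizes in direction.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 40, 2
# Two well-separated Gaussian clusters: linearly separable with high probability
X = np.vstack([rng.normal(3.0, 1.0, (n // 2, d)),
               rng.normal(-3.0, 1.0, (n // 2, d))])
y = np.hstack([np.ones(n // 2), -np.ones(n // 2)])

def logistic_grad(w):
    """Gradient of (1/n) * sum log(1 + exp(-y_i <w, x_i>))."""
    m = y * (X @ w)                     # per-example margins
    s = -y / (1.0 + np.exp(m))          # derivative of the loss w.r.t. the margin
    return X.T @ s / n

w = np.zeros(d)
lr = 0.1
snapshots = {}
for t in range(1, 20001):
    w -= lr * logistic_grad(w)
    if t in (1000, 20000):
        snapshots[t] = w / np.linalg.norm(w)   # direction of the iterate

train_acc = np.mean(np.sign(X @ w) == y)       # training error driven to zero
drift = np.linalg.norm(snapshots[20000] - snapshots[1000])
print(train_acc, drift)
```

On separable data the norm of `w` grows without bound while the loss tends to zero, so only the direction `w / ||w||` is meaningful; prior work (e.g. the implicit-bias results the abstract alludes to) shows this direction converges to the hard-margin SVM solution, which is consistent with the small late-stage drift observed here.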