Typical learning curves for Soft Margin Classifiers (SMCs) learning both realizable and unrealizable tasks are determined using the tools of Statistical Mechanics. We derive the analytical behaviour of the learning curves in the regimes of small and large training sets. The generalization errors exhibit different decay laws towards their asymptotic values as a function of the training set size, depending on general geometrical characteristics of the rule to be learned. Optimal generalization curves are deduced through a fine-tuning of the hyperparameter controlling the trade-off between the error and the regularization terms in the cost function. Even when the task is realizable, the optimal performance of the SMC is better than that of a hard margin Support Vector Machine (SVM) learning the same rule, and is very close to that of the Bayesian classifier.
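As a point of reference (not taken from the abstract itself), the trade-off mentioned above can be illustrated with the standard soft-margin cost function; the hyperparameter $C$, the slack variables $\xi_i$, and the training set size $p$ are generic notation here, and the paper's exact penalty term may differ:
\[
\min_{\mathbf{w},\,b,\,\boldsymbol{\xi}}\ \frac{1}{2}\,\|\mathbf{w}\|^{2} \;+\; C \sum_{i=1}^{p} \xi_i
\qquad \text{subject to} \qquad
y_i\,(\mathbf{w}\cdot\mathbf{x}_i + b) \,\ge\, 1 - \xi_i,\quad \xi_i \ge 0,
\]
where large $C$ penalizes margin violations heavily (recovering the hard margin SVM in the limit $C \to \infty$), while small $C$ gives more weight to the regularization term $\|\mathbf{w}\|^{2}/2$.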