The area under the ROC curve (AUC) is a widely used performance measure for classification models. We propose a new distributionally robust AUC maximization model (DR-AUC) that relies on the Kantorovich metric and approximates the AUC with the hinge loss function. We use duality theory to reformulate the DR-AUC model as a tractable convex quadratic optimization problem. Numerical experiments show that the proposed DR-AUC model, benchmarked against the standard deterministic AUC and support vector machine models, improves out-of-sample performance on the majority of the considered datasets. The results are particularly encouraging because the experiments are conducted with small training sets, which are known to be conducive to low out-of-sample performance.
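
To illustrate what approximating the AUC with a hinge loss can look like, the following is a minimal sketch of the standard pairwise hinge surrogate on a toy dataset. It is not the DR-AUC model itself: the function names, the linear scorer `w`, the toy data, and the unit margin are all assumptions made for illustration, and the distributionally robust (Kantorovich) part is not modeled here.

```python
import numpy as np

def empirical_auc(scores_pos, scores_neg):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    diff = scores_pos[:, None] - scores_neg[None, :]  # all pairwise score gaps
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def pairwise_hinge_loss(scores_pos, scores_neg, margin=1.0):
    """Mean hinge penalty over all pairs (margin=1.0 is an assumed choice).

    With margin 1, each pair's penalty upper-bounds its 0-1 misranking
    indicator, so this quantity is a convex upper bound on 1 - AUC.
    """
    diff = scores_pos[:, None] - scores_neg[None, :]
    return float(np.mean(np.maximum(0.0, margin - diff)))

# Hypothetical linear scorer and toy data, for illustration only.
w = np.array([1.0, -0.5])
X_pos = np.array([[2.0, 0.0], [1.0, 1.0]])
X_neg = np.array([[0.0, 1.0], [1.5, 0.0]])

s_pos = X_pos @ w
s_neg = X_neg @ w

auc = empirical_auc(s_pos, s_neg)
hinge = pairwise_hinge_loss(s_pos, s_neg)
print(f"AUC = {auc}, pairwise hinge loss = {hinge}")
```

Minimizing the hinge quantity over `w` is a convex problem, which is one common reason to use it in place of the non-convex, discontinuous AUC objective.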