The Area Under the ROC Curve (AUC) is a crucial metric for machine learning, which evaluates the average performance over all possible True Positive Rates (TPRs) and False Positive Rates (FPRs). Based on the knowledge that a skillful classifier should simultaneously attain a high TPR and a low FPR, we turn to study a more general variant called Two-way Partial AUC (TPAUC), where only the region with $\mathsf{TPR} \ge \alpha, \mathsf{FPR} \le \beta$ is included in the area. Moreover, recent work shows that the TPAUC is essentially inconsistent with the existing Partial AUC metrics, where only the FPR range is restricted, opening a new problem: seeking solutions that achieve a high TPAUC. Motivated by this, we present in this paper the first attempt to optimize this new metric. The critical challenge along this course lies in the difficulty of performing gradient-based optimization with end-to-end stochastic training, even with a proper choice of surrogate loss. To address this issue, we propose a generic framework to construct surrogate optimization problems, which supports efficient end-to-end training with deep learning. Moreover, our theoretical analyses show that: 1) the objective function of the surrogate problems will achieve an upper bound of the original problem under mild conditions, and 2) optimizing the surrogate problems leads to good generalization performance in terms of TPAUC with a high probability. Finally, empirical studies over several benchmark datasets demonstrate the efficacy of our framework.
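For concreteness, the restricted area described above can be evaluated numerically from a fitted classifier's scores. The sketch below is our own minimal illustration of the TPAUC metric itself (clipping the empirical ROC curve to $\mathsf{TPR} \ge \alpha$, $\mathsf{FPR} \le \beta$ and integrating with the trapezoidal rule); it is not the surrogate optimization framework proposed in the paper, and the function name `tpauc` is hypothetical.

```python
import numpy as np

def tpauc(y_true, scores, alpha, beta):
    """Area under the empirical ROC curve restricted to the region
    TPR >= alpha and FPR <= beta.

    Illustrative sketch only (assumes binary labels in {0, 1} with at
    least one positive and one negative); the maximum attainable value
    of this unnormalized area is beta * (1 - alpha).
    """
    # Sort labels by descending score; sweeping a threshold down this
    # ranking traces out the empirical ROC curve.
    y = np.asarray(y_true)[np.argsort(-np.asarray(scores, dtype=float))]
    P, N = y.sum(), len(y) - y.sum()
    tpr = np.concatenate(([0.0], np.cumsum(y) / P))
    fpr = np.concatenate(([0.0], np.cumsum(1 - y) / N))
    # Clip the curve at FPR = beta, interpolating the TPR there.
    keep = fpr <= beta
    fpr_c = np.concatenate((fpr[keep], [beta]))
    tpr_c = np.concatenate((tpr[keep], [np.interp(beta, fpr, tpr)]))
    # Only the part of the curve lying above TPR = alpha contributes.
    h = np.clip(tpr_c - alpha, 0.0, None)
    # Trapezoidal integration of the clipped height over FPR.
    return float(np.sum((fpr_c[1:] - fpr_c[:-1]) * (h[1:] + h[:-1]) / 2))
```

For a perfectly ranking classifier the clipped region is a full rectangle, so `tpauc` returns $\beta(1-\alpha)$; for a perfectly inverted ranking it returns 0, which matches the intuition that TPAUC rewards classifiers that are accurate on the hard head of both the positive and negative rankings.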