The areas under the ROC curve (AUROC) and the precision-recall curve (AUPRC) are common metrics for evaluating classification performance on imbalanced problems. Compared with AUROC, AUPRC is a more appropriate metric for highly imbalanced datasets. While stochastic optimization of AUROC has been studied extensively, principled stochastic optimization of AUPRC has rarely been explored. In this work, we propose a principled technical method to optimize AUPRC for deep learning. Our approach is based on maximizing the average precision (AP), which is an unbiased point estimator of AUPRC. We cast the objective into a sum of {\it dependent compositional functions} whose inner functions depend on random variables of the outer level. We propose efficient adaptive and non-adaptive stochastic algorithms with {\it provable convergence guarantees under mild conditions} by leveraging recent advances in stochastic compositional optimization. Extensive experimental results on image and graph datasets demonstrate that our proposed method outperforms prior methods on imbalanced problems in terms of AUPRC. To the best of our knowledge, our work represents the first attempt to optimize AUPRC with provable convergence.
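To make the AP objective concrete, the following minimal sketch computes average precision directly from its standard ranking definition: for each positive example, the fraction of positives among all examples scored at least as high, averaged over the positives. It also shows a smoothed variant in which each indicator is replaced by a squared hinge loss, which exposes the "ratio of two sums per positive example" structure that the abstract calls dependent compositional functions. The function names, the squared hinge surrogate, and the `margin` parameter are illustrative assumptions, not details taken from the abstract.

```python
import numpy as np


def average_precision(scores, labels):
    """Exact AP: mean over positives of (positives ranked at or above) / (all ranked at or above)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos = np.flatnonzero(labels == 1)
    if pos.size == 0:
        raise ValueError("need at least one positive example")
    # at_or_above[i, s] is True when example s scores at least as high as the i-th positive
    at_or_above = scores[None, :] >= scores[pos, None]
    numer = (at_or_above & (labels[None, :] == 1)).sum(axis=1)
    denom = at_or_above.sum(axis=1)
    return float(np.mean(numer / denom))


def surrogate_average_precision(scores, labels, margin=1.0):
    """Smoothed AP with each indicator replaced by a squared hinge (an assumed surrogate)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos = np.flatnonzero(labels == 1)
    if pos.size == 0:
        raise ValueError("need at least one positive example")
    # diff[i, s] = score of the i-th positive minus score of example s
    diff = scores[pos, None] - scores[None, :]
    # Squared hinge stands in for the indicator "example s scores at least as high as positive i"
    loss = np.maximum(0.0, margin - diff) ** 2
    # Inner sums: these depend on which positive i is drawn at the outer level
    numer = (loss * (labels[None, :] == 1)).sum(axis=1)
    denom = loss.sum(axis=1)  # strictly positive: the s == i term equals margin ** 2
    return float(np.mean(numer / denom))


scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
print(round(average_precision(scores, labels), 4))  # 0.8333
print(surrogate_average_precision(scores, labels))
```

In the exact version both inner sums count examples, so the objective is piecewise constant in the scores and has no useful gradient; the surrogate version is differentiable almost everywhere, which is what makes a stochastic gradient treatment plausible.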