The ROC curve is widely used to assess the quality of prediction, classification, and ranking algorithms, and its properties have been extensively studied. The precision-recall (PR) curve has become the de facto replacement for the ROC curve in the presence of imbalance, that is, where one class is far more likely than the other. Although the PR and ROC curves tend to be used interchangeably, they have some very different properties. Properties of the PR curve are the focus of this paper. We consider: (1) population PR curves, where complete distributional assumptions are specified for scores from both classes; and (2) empirical estimators of the PR curve, where we observe scores and make no distributional assumptions. These properties have direct consequences for how the PR curve should, and should not, be used. For example, the empirical PR curve is not consistent when scores in the class of primary interest come from discrete distributions. On the other hand, a normal approximation can fit quite well for points on the empirical PR curve when scores are continuous, but convergence can be heavily influenced by the distributional setting, the amount of imbalance, and the point of interest on the PR curve.
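To make the notion of an empirical PR curve concrete, the following is a minimal sketch, not the paper's estimator. It assumes a case is called positive when its score is at or above a threshold, and it evaluates precision and recall at every distinct observed score. The function name `empirical_pr_curve` and its arguments are hypothetical and chosen only for illustration.

```python
import numpy as np


def empirical_pr_curve(pos_scores, neg_scores):
    """Empirical precision and recall at each distinct observed score.

    Assumption (illustrative only): a case is called positive when its
    score is >= the threshold. Tied scores share a single threshold.
    """
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.asarray(neg_scores, dtype=float)

    # Distinct observed scores, from the strictest threshold to the loosest.
    thresholds = np.unique(np.concatenate([pos, neg]))[::-1]

    # True and false positive counts at each threshold.
    tp = np.array([(pos >= c).sum() for c in thresholds])
    fp = np.array([(neg >= c).sum() for c in thresholds])

    recall = tp / pos.size
    # tp + fp >= 1 at every observed threshold, so the ratio is defined.
    precision = tp / (tp + fp)
    return thresholds, recall, precision


# Small imbalanced toy sample: 3 positives, 4 negatives (made-up scores).
thresholds, recall, precision = empirical_pr_curve(
    [0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.1]
)
for c, r, p in zip(thresholds, recall, precision):
    print(f"threshold={c:.1f}  recall={r:.3f}  precision={p:.3f}")
```

At the loosest threshold every case is called positive, so recall is 1 and precision equals the fraction of positives in the sample (3/7 in this toy data). This is one way to see how the amount of imbalance enters the PR curve directly, which is not the case for the ROC curve.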