Incremental value (IncV) measures the improvement in performance when moving from an existing risk model to a new one. In this paper, we compare the IncV of the area under the receiver operating characteristic curve (IncV-AUC) with the IncV of the area under the precision-recall curve (IncV-AP). Since both are semi-proper scoring rules, we also compare them with a strictly proper scoring rule: the IncV of the scaled Brier score (IncV-sBrS). The comparisons are carried out in a numerical study under various event rates. The results show that the IncV-AP and the IncV-sBrS are highly consistent, whereas the IncV-AUC and the IncV-sBrS are negatively correlated at a low event rate. The IncV-AUC and the IncV-AP are the least consistent of the three pairs, and their differences become more pronounced as the event rate decreases. To investigate this phenomenon, we derive expressions for these two metrics. Both are weighted averages of the change (from the existing model to the new one) in the separation of the risk score distributions between events and non-events. However, the IncV-AP assigns heavier weights to changes in the higher-risk group, while the IncV-AUC weights the entire population equally. We further illustrate this point with a data example of two risk models for predicting acute ovarian failure. The new model has a slightly lower AUC but increases the AP by 48%. We conclude that when the objective is to identify the high-risk group, the IncV-AP is the more appropriate metric, especially when the event rate is low.
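As a rough illustration of the three quantities compared above, the following sketch computes the IncV-AUC, IncV-AP, and IncV-sBrS for a pair of risk scores. It is not the authors' code. The function names, the simulated data, and the two hypothetical models are assumptions made for illustration, and the scaled Brier score is taken here to be one minus the Brier score divided by pi(1 - pi), where pi is the observed event rate.

```python
import numpy as np

def auc(y, r):
    """Probability that an event outscores a non-event; ties count as 1/2."""
    pos, neg = r[y == 1], r[y == 0]
    diff = pos[:, None] - neg[None, :]          # all event/non-event pairs
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))

def average_precision(y, r):
    """Mean of the precision evaluated at the rank of each event."""
    order = np.argsort(-r, kind="stable")       # highest risk first
    y_sorted = y[order]
    precision = np.cumsum(y_sorted) / np.arange(1, len(y) + 1)
    return float(precision[y_sorted == 1].mean())

def scaled_brier(y, r):
    """1 - BrS / (pi * (1 - pi)), with pi the observed event rate (assumed form)."""
    pi = y.mean()
    return float(1.0 - np.mean((y - r) ** 2) / (pi * (1.0 - pi)))

def incremental_values(y, r_old, r_new):
    """Metric of the new model minus metric of the existing model."""
    metrics = {"AUC": auc, "AP": average_precision, "sBrS": scaled_brier}
    return {f"IncV-{k}": f(y, r_new) - f(y, r_old) for k, f in metrics.items()}

# Hypothetical low-event-rate data: the "existing" model uses one predictor,
# the "new" model uses both. Neither model comes from the paper.
rng = np.random.default_rng(0)
n = 2000
x1, x2 = rng.normal(size=n), rng.normal(size=n)
expit = lambda z: 1.0 / (1.0 + np.exp(-z))
y = rng.binomial(1, expit(-4.0 + x1 + x2))
r_old = expit(-3.6 + x1)                        # illustrative, not refitted
r_new = expit(-4.0 + x1 + x2)

for name, value in incremental_values(y, r_old, r_new).items():
    print(f"{name}: {value:+.3f}")
```

The pairwise AUC computation is quadratic in memory and is meant only for small samples; a rank-based formula would be preferable for larger data.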