Harrell's concordance index is a commonly used discrimination metric for survival models, particularly for models where the relative ordering of the risk of individuals is time-independent, such as the proportional hazards model. There are several suggestions, but no consensus, on how it could be extended to models where relative risk can vary over time, e.g. in the case of crossing hazard rates. We show that these concordance indices are not proper, where a concordance index is called proper if it is maximised in the limit by the true data-generating model. Furthermore, we show that a concordance index is proper if and only if the risk score used is concordant with the hazard rate at the first event time for each comparable pair of events. Thus, we suggest using the hazard rate as the time-varying risk score when calculating concordance. Through simulations, we demonstrate situations in which other concordance indices can lead to incorrect models being selected over a true model, justifying the use of our suggested risk prediction both in model selection and in loss functions, e.g. for deep learning models.
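As an illustration of the suggested risk score, the following is a minimal sketch of a concordance index in which each comparable pair is scored by the model hazard evaluated at the earlier (first) event time of the pair. The function name `concordance_at_first_event` and the `hazard(i, t)` callback signature are assumptions made for this sketch, not an API from the paper. Pairs with tied observation times are skipped and tied hazards count as one half; both conventions are assumptions here as well.

```python
from typing import Callable, Sequence


def concordance_at_first_event(
    times: Sequence[float],
    events: Sequence[bool],
    hazard: Callable[[int, float], float],
) -> float:
    """Concordance using the hazard at the first event time as risk score.

    times[i]     : observed time (event or censoring) of individual i
    events[i]    : True if the observed time is an event, False if censored
    hazard(i, t) : model hazard rate of individual i at time t
                   (hypothetical callback, assumed for this sketch)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # the earlier member of a pair must have an observed event
        t_first = times[i]
        for j in range(n):
            if times[j] <= t_first:
                continue  # j must still be at risk strictly after i's event
            comparable += 1
            h_i = hazard(i, t_first)
            h_j = hazard(j, t_first)
            if h_i > h_j:
                concordant += 1.0  # earlier event had the higher hazard
            elif h_i == h_j:
                concordant += 0.5  # tied scores count as one half (assumption)
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable
```

With a time-independent risk ordering, as in a proportional hazards model, the evaluation time does not affect which member of a pair has the higher hazard, so this reduces to the usual pairwise comparison of a fixed risk score. The evaluation time only matters when hazards can cross.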