In this paper we consider how to evaluate survival distribution predictions with measures of discrimination. This is a non-trivial problem: discrimination measures are the most commonly used evaluation measures in survival analysis, yet there is no clear method for deriving a risk prediction from a distribution prediction. We survey the methods proposed in the literature and in software and consider their respective advantages and disadvantages. Whilst distributions are frequently evaluated with discrimination measures, we find that the method for doing so is rarely described in the literature and that it often leads to unfair comparisons. We find that the most robust method of reducing a distribution to a risk is to sum over the predicted cumulative hazard. We recommend that machine learning survival analysis software implement clear transformations between distribution and risk predictions in order to allow more transparent and accessible model evaluation.
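As an illustration of reducing a distribution to a risk by summing over the predicted cumulative hazard, the following is a minimal sketch rather than the paper's own implementation. It assumes that every subject's survival function is predicted on a shared time grid, that the cumulative hazard is obtained as H(t) = -log S(t), and that a higher score means higher risk. The function name `risk_from_survival` and the clipping constant `eps` are hypothetical choices made for this example.

```python
import numpy as np


def risk_from_survival(surv, eps=1e-12):
    """Reduce predicted survival curves to one risk score per subject.

    surv: array of shape (n_subjects, n_times) holding S_i(t_j) on a
          shared time grid (an assumption of this sketch).
    Returns, for each subject, the sum over the grid of the cumulative
    hazard H_i(t_j) = -log S_i(t_j).
    """
    surv = np.asarray(surv, dtype=float)
    # Clip so that curves reaching zero do not produce log(0);
    # eps is an illustrative choice, not a recommended value.
    cumhaz = -np.log(np.clip(surv, eps, 1.0))
    return cumhaz.sum(axis=1)


# Toy predictions for two subjects at three time points.
surv = np.array([
    [0.9, 0.7, 0.5],  # survival falls slowly: lower risk
    [0.8, 0.4, 0.1],  # survival falls quickly: higher risk
])
risk = risk_from_survival(surv)
```

The resulting `risk` vector can then be passed to any discrimination measure that expects a single score per subject, such as a concordance index.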