Deep models trained through maximum likelihood have achieved state-of-the-art results for survival analysis. Despite this training scheme, practitioners evaluate models under other criteria, such as binary classification losses at a chosen set of time horizons, e.g. the Brier score (BS) and the Bernoulli log-likelihood (BLL). Models trained with maximum likelihood may have poor BS or BLL since maximum likelihood does not directly optimize these criteria. Directly optimizing criteria like BS requires inverse-weighting by the censoring distribution; estimating the censoring distribution itself requires inverse-weighting by the failure distribution. But neither distribution is known. To resolve this dilemma, we introduce Inverse-Weighted Survival Games, which train both failure and censoring models with respect to criteria such as BS or BLL. In these games, the objective for each model is built from a re-weighted estimate featuring the other model, where the re-weighting model is held fixed during training. When the loss is proper, we show that the games always have the true failure and censoring distributions as a stationary point, meaning that models in the game do not leave the correct distributions once reached. We construct one case where this stationary point is unique. We show that these games optimize BS on simulations and then apply these principles to real-world cancer and critically-ill patient data.
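To make the inverse-weighting concrete, the following is a minimal NumPy sketch of the inverse-probability-of-censoring-weighted (IPCW) Brier score at a single horizon. The function name and the callable `cens_surv` (an estimate of the censoring survival function G) are illustrative assumptions for this sketch, not the paper's implementation.

```python
import numpy as np

def ipcw_brier_score(surv_prob, times, events, horizon, cens_surv):
    """IPCW Brier score at one time horizon (illustrative sketch).

    surv_prob : predicted P(T > horizon | x_i) for each subject
    times     : observed times (min of failure and censoring time)
    events    : 1 if failure observed, 0 if censored
    horizon   : evaluation time t
    cens_surv : callable giving an estimate of G(t) = P(C > t)
    """
    surv_prob = np.asarray(surv_prob, dtype=float)
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)

    # Subjects observed to fail by the horizon: true label 0 (did not
    # survive), weighted by 1 / G(T_i).
    failed = (times <= horizon) & (events == 1)
    term_failed = failed * (0.0 - surv_prob) ** 2 / cens_surv(times)

    # Subjects still at risk past the horizon: true label 1 (survived),
    # weighted by 1 / G(horizon).
    at_risk = times > horizon
    term_at_risk = at_risk * (1.0 - surv_prob) ** 2 / cens_surv(horizon)

    # Subjects censored before the horizon contribute nothing directly;
    # their mass is recovered through the inverse weights.
    return float(np.mean(term_failed + term_at_risk))
```

With no censoring (`cens_surv` identically 1), this reduces to a plain squared-error classification loss at the horizon; the circularity noted above arises because `cens_surv` must itself be estimated under inverse-weighting by the failure distribution.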