Survival analysis, or time-to-event modelling, is a classical statistical problem that has attracted considerable interest for its practical use in epidemiology, demography, and actuarial science. Recent machine learning work on the subject, driven by the rise of individualized medicine, has focused on precise per-individual predictions rather than population-level studies. We introduce a conditional normalizing-flow-based estimate of the time-to-event density as a way to model highly flexible, individualized conditional survival distributions. We use a novel hierarchical formulation of normalizing flows to enable efficient fitting of flexible conditional distributions without overfitting, and we show how the normalizing flow formulation can be efficiently adapted to the censored setting. We experimentally validate the proposed approach on a synthetic dataset, four open medical datasets, and an example of a common financial problem.
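To make the censored setting concrete, the sketch below shows one common way a flow-based density can be fitted to right-censored data: observed events contribute the log-density obtained through the change-of-variables formula, while censored individuals contribute the log-survival probability. It is only an illustration under strong assumptions, not the hierarchical formulation proposed here. The flow is a single conditional affine map on log-time with a standard normal base (which amounts to a log-normal model), and the names `censored_nll`, `w_mu`, and `w_log_sigma` are hypothetical.

```python
import math
import numpy as np


def std_normal_logpdf(z):
    # Log-density of the standard normal base distribution.
    return -0.5 * (z ** 2 + math.log(2.0 * math.pi))


def std_normal_logsf(z):
    # log P(Z > z) for a standard normal Z, computed with math.erfc.
    # Note: for very large z this underflows to log(0) = -inf.
    z = np.atleast_1d(z)
    return np.log(0.5 * np.array([math.erfc(v / math.sqrt(2.0)) for v in z]))


def censored_nll(t, event, x, w_mu, w_log_sigma):
    """Mean negative log-likelihood for right-censored data.

    Assumed (illustrative) flow: z = (log t - mu(x)) / sigma(x),
    with mu(x) = x @ w_mu and log sigma(x) = x @ w_log_sigma.
    event[i] is 1 if the event was observed at t[i], 0 if censored at t[i].
    """
    mu = x @ w_mu
    log_sigma = x @ w_log_sigma
    z = (np.log(t) - mu) / np.exp(log_sigma)

    # Observed events: log f(t|x) = log p_Z(z) + log|dz/dt|
    #                             = log p_Z(z) - log sigma(x) - log t.
    log_density = std_normal_logpdf(z) - log_sigma - np.log(t)

    # Censored individuals: log S(t|x) = log P(Z > z), since the map is increasing in t.
    log_survival = std_normal_logsf(z)

    log_lik = np.where(event == 1, log_density, log_survival)
    return -log_lik.mean()


# Tiny usage example with made-up numbers (not data from the experiments).
x = np.ones((3, 1))
t = np.array([1.0, 1.0, 1.0])
event = np.array([1, 0, 1])
nll = censored_nll(t, event, x, np.array([0.0]), np.array([0.0]))
```

In this sketch the survival term is available in closed form because the flow is monotone in time and the base distribution has a known tail function; a more expressive flow would need the same two ingredients, a tractable density and a tractable survival function.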