Learning effectively from electronic health records (EHR) data to predict clinical outcomes is often challenging because features are recorded at irregular timesteps, patients are lost to follow-up, and competing events such as death or disease progression can occur. To address these challenges, we propose a generative time-to-event model, SurvLatent ODE, which adopts an Ordinary Differential Equation-based Recurrent Neural Network (ODE-RNN) as an encoder to effectively parameterize the dynamics of latent states under irregularly sampled input data. Our model then uses the resulting latent embedding to flexibly estimate survival times for multiple competing events without specifying the shapes of the event-specific hazard functions. We demonstrate competitive performance of our model on two datasets: MIMIC-III, a freely available longitudinal dataset collected from critical care units, for predicting hospital mortality; and data from the Dana-Farber Cancer Institute (DFCI), for predicting onset of venous thromboembolism (VTE), a life-threatening complication for patients with cancer, with death as a competing event. SurvLatent ODE outperforms the current clinical standard, the Khorana Risk Score, for stratifying VTE risk groups, while providing clinically meaningful and interpretable latent representations.
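To make the idea of estimating survival for multiple competing events without a fixed hazard shape more concrete, the following is a minimal sketch of one standard discrete-time formulation: given cause-specific hazards on a time grid, it accumulates the overall event-free survival and a cumulative incidence function (CIF) per event. This is an illustration only, not the implementation used for SurvLatent ODE. The function name `cumulative_incidence`, the two-interval grid, and the hazard values for VTE and death are all hypothetical; in a model of this kind the hazards would come from a decoder applied to the latent trajectory.

```python
def cumulative_incidence(hazards, tol=1e-12):
    """Discrete-time competing-risks bookkeeping (illustrative sketch).

    hazards[k][j] is the assumed probability that event k occurs in
    interval j, given that no event has occurred before interval j.
    Returns (survival, cif), where survival[j] is the probability of being
    event-free after interval j and cif[k][j] is the cumulative incidence
    of event k through interval j.
    """
    n_events = len(hazards)
    n_steps = len(hazards[0])
    if any(len(row) != n_steps for row in hazards):
        raise ValueError("all events must share the same time grid")

    survival = []
    cif = [[] for _ in range(n_events)]
    surv_prev = 1.0               # event-free probability before interval j
    running = [0.0] * n_events    # running CIF per event

    for j in range(n_steps):
        total = sum(hazards[k][j] for k in range(n_events))
        if total < -tol or total > 1.0 + tol:
            raise ValueError("hazards in one interval must sum to [0, 1]")
        for k in range(n_events):
            # Event k in interval j requires surviving all earlier intervals.
            running[k] += surv_prev * hazards[k][j]
            cif[k].append(running[k])
        surv_prev *= 1.0 - total
        survival.append(surv_prev)
    return survival, cif


# Hypothetical hazards over two intervals: row 0 = VTE, row 1 = death.
example_hazards = [
    [0.10, 0.20],
    [0.05, 0.10],
]
survival, cif = cumulative_incidence(example_hazards)
print("event-free survival:", survival)
print("CIF (VTE):", cif[0])
print("CIF (death):", cif[1])
```

Because no parametric form is imposed on the per-interval hazards, the resulting curves can take any shape the hazards imply, and at every interval the event-free survival and the CIFs sum to one.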