Learning effectively from electronic health records (EHR) data to predict clinical outcomes is often challenging because features are recorded at irregular timesteps, patients are lost to follow-up, and competing events such as death or disease progression may occur. To address these challenges, we propose a generative time-to-event model, SurvLatent ODE, which adopts an Ordinary Differential Equation-based Recurrent Neural Network (ODE-RNN) as an encoder to effectively parameterize a latent representation under irregularly sampled data. Our model then uses the latent representation to flexibly estimate survival times for multiple competing events without specifying the shapes of the event-specific hazard functions. We demonstrate competitive performance of our model on two datasets: MIMIC-III, a freely available longitudinal dataset collected from critical care units, for predicting hospital mortality; and data from the Dana-Farber Cancer Institute (DFCI), for predicting the onset of Deep Vein Thrombosis (DVT), a life-threatening complication for patients with cancer, with death as a competing event. SurvLatent ODE outperforms the current clinical standard, the Khorana Risk Score, for stratifying DVT risk groups.
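To make the competing-events setting concrete, the following minimal sketch shows one standard way to turn event-specific hazards into cumulative incidence curves in discrete time. It is an illustration under stated assumptions, not the SurvLatent ODE implementation: the function name `cumulative_incidence`, the discrete-interval formulation, and the hazard values are all hypothetical, and in the model described above the hazards would be estimated from the latent representation rather than supplied by hand.

```python
from typing import Dict, List


def cumulative_incidence(hazards: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Convert per-interval cause-specific hazards into cumulative incidence curves.

    hazards[k][t] is the probability that event k occurs in interval t,
    given that no event of any type occurred before interval t.
    (Assumed discrete-time formulation, for illustration only.)
    """
    n_steps = len(next(iter(hazards.values())))
    cif = {k: [] for k in hazards}
    running = {k: 0.0 for k in hazards}
    surv = 1.0  # probability of being event-free before the current interval

    for t in range(n_steps):
        total = sum(h[t] for h in hazards.values())
        if not 0.0 <= total <= 1.0:
            raise ValueError("hazards in one interval must sum to a value in [0, 1]")
        for k, h in hazards.items():
            # Event k in interval t requires surviving all earlier intervals.
            running[k] += surv * h[t]
            cif[k].append(running[k])
        # Remain event-free only if no competing event fires in this interval.
        surv *= 1.0 - total

    return cif


# Hypothetical hazards for two competing events over two intervals.
example = {"dvt": [0.10, 0.10], "death": [0.05, 0.05]}
curves = cumulative_incidence(example)
```

Because death removes a patient from the population at risk for DVT, each cumulative incidence curve depends on the hazards of both events. This is why a competing event cannot simply be treated as censoring.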