Predicting patients' health risks from Electronic Health Records (EHRs) has attracted considerable attention in recent years, especially with the development of deep learning techniques. Health risk refers to the probability that a specific health outcome occurs for a specific patient. The predicted risks can be used to support decision-making by healthcare professionals. EHRs are structured patient journey data: each patient journey contains a chronologically ordered set of clinical events, and each clinical event contains a set of clinical/medical activities. Because patient conditions and treatment needs vary, EHR patient journey data has an inherently high degree of missingness, and this missingness itself carries important information that affects the relationships among variables, including time. Existing deep learning-based models impute missing values while learning these relationships. However, imputed values may distort the clinical meaning of the original EHR patient journey data, resulting in classification bias. This paper proposes a novel end-to-end approach to modeling EHR patient journey data with integrated convolutional and recurrent neural networks. Our model captures both long- and short-term temporal patterns within each patient journey and effectively handles the high degree of missingness in EHR data without generating any imputed data. Extensive experiments on two real-world datasets show that the proposed model is robust and achieves higher prediction accuracy than existing state-of-the-art imputation-based prediction methods.
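As a rough illustration of the general idea of keeping missingness explicit instead of imputing, the sketch below encodes a toy patient journey as a value grid plus an observation mask and aggregates over observed entries only. It is not the model proposed in the paper. The variable names, the toy data, and the helper functions `to_values_and_mask` and `masked_mean` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy journey: each clinical event records only the
# activities that were actually performed (names are illustrative).
journey: List[Dict[str, float]] = [
    {"heart_rate": 80.0, "glucose": 5.5},
    {"heart_rate": 90.0},
    {"glucose": 6.5, "lactate": 2.0},
]
variables: List[str] = ["heart_rate", "glucose", "lactate"]


def to_values_and_mask(
    events: List[Dict[str, float]], names: List[str]
) -> Tuple[List[List[float]], List[List[int]]]:
    """Return (values, mask); mask[t][j] is 1 if variable j was observed at event t."""
    values: List[List[float]] = []
    mask: List[List[int]] = []
    for event in events:
        # 0.0 is only a placeholder for unobserved cells; the mask marks it as unused.
        values.append([event.get(name, 0.0) for name in names])
        mask.append([1 if name in event else 0 for name in names])
    return values, mask


def masked_mean(
    values: List[List[float]], mask: List[List[int]]
) -> List[Optional[float]]:
    """Per-variable mean over observed entries only; None if never observed."""
    means: List[Optional[float]] = []
    for j in range(len(mask[0])):
        observed = [values[t][j] for t in range(len(values)) if mask[t][j]]
        means.append(sum(observed) / len(observed) if observed else None)
    return means


values, mask = to_values_and_mask(journey, variables)
means = masked_mean(values, mask)
```

Here the mask itself is retained as a signal (which activities were performed at which event), and no value is fabricated for the unobserved cells. A learned model could consume the mask alongside the values, but how the paper's network does this is not specified in the abstract.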