Electronic Health Records (EHR) are heavily used in modern healthcare systems to record patients' hospital admission information. Many data-driven approaches use temporal features in EHR to predict specific diseases, readmission times, or patient diagnoses. However, most existing predictive models cannot fully utilize EHR data, because some temporal events inherently lack labels for supervised training. Moreover, existing works struggle to provide generic and personalized interpretability at the same time. To address these challenges, we first propose a hyperbolic embedding method with information flow to pre-train medical code representations in a hierarchical structure. We incorporate these pre-trained representations into a graph neural network to detect disease complications, and design a multi-level attention method to compute the contributions of particular diseases and admissions, thus enhancing personalized interpretability. We present a new hierarchy-enhanced historical prediction proxy task in our self-supervised learning framework to fully utilize EHR data and exploit medical domain knowledge. We conduct a comprehensive set of experiments and case studies on widely used, publicly available EHR datasets to verify the effectiveness of our model. The results demonstrate our model's strengths in both predictive tasks and interpretability.