Objective: To combine medical knowledge and medical data to predict disease risk in an interpretable way. Methods: We formulated the disease prediction task as a random walk along a knowledge graph (KG). Specifically, we built a KG that records relationships between diseases and risk factors according to validated medical knowledge. A mathematical object then walks along the KG. It starts at a patient entity, which is connected to the KG through the patient's current diseases or risk factors, and stops at a disease entity, which represents the predicted disease. The trajectory generated by the object represents an interpretable disease progression path for the given patient. The dynamics of the object are controlled by a policy-based reinforcement learning (RL) module, which is trained on electronic health records (EHRs). Experiments: We used two real-world EHR datasets to evaluate the performance of our model. In the disease prediction task, our model achieves a macro-averaged area under the curve (AUC) of 0.743 and 0.639 in predicting 53 circulatory system diseases on the two datasets, respectively. This performance is comparable to that of the machine learning (ML) models commonly used in medical research. In the qualitative analysis, our clinical collaborator reviewed the disease progression paths generated by our model and endorsed their interpretability and reliability. Conclusion: Experimental results validate the proposed model's ability to evaluate and optimize disease prediction interpretably. Significance: Our work contributes to jointly leveraging medical knowledge and medical data for interpretable prediction tasks.
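The walk described in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the toy graph, the tabular edge scores standing in for the policy module, the 0/1 reward, and the names `walk` and `reinforce_update` are all assumptions made for illustration, and the update is a plain REINFORCE step rather than whatever policy-gradient variant the paper uses.

```python
import math
import random

# Hypothetical toy knowledge graph: entity -> neighbouring entities.
# Entity names and edges are illustrative only, not taken from the paper.
KG = {
    "hypertension": ["atrial_fibrillation", "heart_failure"],
    "diabetes": ["coronary_artery_disease"],
    "atrial_fibrillation": ["stroke", "heart_failure"],
    "coronary_artery_disease": ["heart_failure"],
    "heart_failure": [],
    "stroke": [],
}
DISEASES = {"heart_failure", "stroke"}  # assumed terminal disease entities


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def walk(patient_conditions, logits, rng, max_steps=5):
    """Sample one path from the patient entity to a disease entity.

    The patient entity is linked to the KG through its current conditions.
    `logits[(src, dst)]` is a learnable edge score (assumed tabular policy);
    missing entries default to 0.0. Returns the path and the per-step
    records needed for a policy-gradient update.
    """
    path, steps = ["patient"], []
    neighbours = list(patient_conditions)
    for _ in range(max_steps):
        if not neighbours:
            break
        current = path[-1]
        probs = softmax([logits.get((current, n), 0.0) for n in neighbours])
        chosen = rng.choices(neighbours, weights=probs, k=1)[0]
        steps.append((current, neighbours, probs, chosen))
        path.append(chosen)
        if chosen in DISEASES:
            break
        neighbours = KG[chosen]
    return path, steps


def reinforce_update(steps, reward, logits, lr=0.5):
    """One REINFORCE step: d log pi / d logit = 1[chosen] - prob."""
    for current, neighbours, probs, chosen in steps:
        for n, p in zip(neighbours, probs):
            grad = (1.0 if n == chosen else 0.0) - p
            logits[(current, n)] = logits.get((current, n), 0.0) + lr * reward * grad


def train(patient_conditions, observed_disease, episodes=200, seed=0):
    """Reward walks that end at the disease observed in the (toy) record."""
    rng = random.Random(seed)
    logits = {}
    for _ in range(episodes):
        path, steps = walk(patient_conditions, logits, rng)
        reward = 1.0 if path[-1] == observed_disease else 0.0
        reinforce_update(steps, reward, logits)
    return logits
```

Calling `train(["hypertension"], "stroke")` and then `walk(["hypertension"], logits, random.Random(1))` returns a path such as patient, risk factor, intermediate condition, disease. The sampled path is what would be read as the progression path, and the softmax over outgoing edges is the only place where the learned policy enters.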