Despite the abundance of Electronic Healthcare Records (EHRs), their heterogeneity restricts the use of medical data in building predictive models. To address this challenge, we propose the Universal Healthcare Predictive Framework (UniHPF), which requires no medical domain knowledge and only minimal pre-processing to support multiple prediction tasks. Experimental results demonstrate that UniHPF can build large-scale EHR models that process any form of medical data from distinct EHR systems. Our framework significantly outperforms baseline models in multi-source learning tasks, including transfer and pooled learning, while showing comparable results when trained on a single medical dataset. To empirically demonstrate the efficacy of our work, we conducted extensive experiments across various datasets, model structures, and tasks. We believe that our findings can provide helpful insights for further research on multi-source learning of EHRs.