Healthcare has become an increasingly important research topic. The growing volume of data in the healthcare domain offers a great opportunity for deep learning to improve the quality of medical services. However, the complexity of electronic health record (EHR) data is a challenge for applying deep learning. Specifically, the data produced during hospital admissions are recorded by the EHR system and include structured data such as daily body temperature, and unstructured data such as free text and laboratory measurements. Although some preprocessing frameworks have been proposed for specific EHR data, they do not consider clinical notes, which carry significant clinical value. Moreover, it remains unclear whether these data from different views are all beneficial to medical tasks and how best to utilize them. Therefore, in this paper, we first extract the accompanying clinical notes from the EHR and propose a method to integrate these data. We also comprehensively study different models and ways of leveraging the data for better medical task prediction. The results on two medical prediction tasks show that our fused model with different data outperforms the state-of-the-art method that does not use clinical notes, which illustrates the importance of our fusion method and the value of clinical note features. Our code is available at https://github.com/emnlp-mimic/mimic.