In recent years, estimating the duration of medical interventions from electronic health records (EHRs) has gained significant attention in the field of clinical decision support. However, current models largely focus on structured data and leave out the information contained in unstructured clinical free text. To address this, we present a novel language-enhanced transformer-based framework that projects all relevant clinical data modalities (continuous, categorical, binary, and free-text features) into a harmonized language latent space using a pre-trained sentence encoder with the help of medical prompts. The proposed method enables information from the different modalities to be integrated within the cell transformer encoder and leads to more accurate duration estimation for medical interventions. Experimental results on both US-based (ICU length-of-stay estimation) and Asian (surgical duration prediction) medical datasets demonstrate the effectiveness of the proposed framework, which outperforms tailored baseline approaches and is robust to data corruption in EHRs.