Objective: Evictions are part of a cascade of negative events that can lead to unemployment, homelessness, long-term poverty, and mental health problems. In this study, we developed a natural language processing system to automatically detect eviction incidents and their attributes from electronic health record (EHR) notes. Materials and Methods: We annotated eviction status in 5000 EHR notes from the Veterans Health Administration. We developed a novel model, called Knowledge Injection based on Ripple Effects of Social and Behavioral Determinants of Health (KIRESH), that has been shown to substantially outperform other state-of-the-art approaches, such as fine-tuning pre-trained language models like BioBERT and Bio_ClinicalBERT. Moreover, we designed a prompt that further improves model performance by exploiting the intrinsic connection between the two sub-tasks of eviction presence prediction and eviction period prediction. Finally, we applied temperature scaling-based calibration to our KIRESH-Prompt method to avoid the over-confidence issues arising from the imbalanced dataset. Results: KIRESH-Prompt achieved a Macro-F1 of 0.6273 (presence) and 0.7115 (period), which was significantly higher than the 0.5382 (presence) and 0.67167 (period) obtained by fine-tuning the Bio_ClinicalBERT model alone. Conclusion and Future Work: KIRESH-Prompt substantially improved eviction status classification. In future work, we will evaluate the generalizability of the model framework to other applications.
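The calibration step mentioned above can be illustrated with a minimal sketch of generic temperature scaling: the classifier's logits are divided by a single scalar T, chosen to minimize the negative log-likelihood on held-out data, which softens over-confident probabilities without changing the predicted class. The abstract does not describe the authors' implementation, so everything here is an assumption for illustration: the function names, the toy logits and labels, and the use of a simple grid search in place of a gradient-based optimizer.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax of one logit vector after dividing by the temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def nll(logit_rows, labels, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    loss = 0.0
    for logits, y in zip(logit_rows, labels):
        loss -= math.log(softmax(logits, temperature)[y])
    return loss / len(labels)

def fit_temperature(logit_rows, labels, lo=0.05, hi=20.0, steps=400):
    """Pick the T on a log-spaced grid that minimizes validation NLL.

    Grid search is a stand-in chosen for simplicity; it is not taken
    from the paper.
    """
    best_t, best_loss = 1.0, nll(logit_rows, labels, 1.0)
    for i in range(steps + 1):
        t = lo * (hi / lo) ** (i / steps)
        loss = nll(logit_rows, labels, t)
        if loss < best_loss:
            best_t, best_loss = t, loss
    return best_t

# Hypothetical validation logits for a binary task (made-up numbers):
# the model is very confident on every note but wrong on the second one.
val_logits = [[4.0, 0.0], [3.5, 0.5], [0.2, 3.8],
              [4.2, 0.1], [3.9, 0.3], [0.4, 3.6]]
val_labels = [0, 1, 1, 0, 0, 1]

T = fit_temperature(val_logits, val_labels)
calibrated = [softmax(row, T) for row in val_logits]
```

Because the toy model is more confident than it is accurate, the fitted T comes out above 1, and each calibrated probability vector is flatter than its uncalibrated counterpart while the argmax stays the same.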