Objective: Social determinants of health (SDOH) influence personal health outcomes and interactions with health systems. Health systems capture SDOH information through structured data and unstructured clinical notes; however, clinical notes often contain a more comprehensive representation of several key SDOH. The objective of this work is to assess the SDOH information gain achievable by extracting structured semantic representations of SDOH from the clinical narrative and combining these extracted representations with available structured data. Materials and Methods: We developed a natural language processing (NLP) information extraction model for SDOH that uses a deep learning entity and relation extraction architecture. In an electronic health record (EHR) case study, we applied the SDOH extractor to a large existing clinical data set with over 200,000 patients and 400,000 notes and compared the extracted information with the available structured data. Results: The SDOH extractor achieved 0.86 F1 on a withheld test set. In the EHR case study, we found that for 19% of current tobacco users, 10% of drug users, and 32% of homeless patients, these risk factors are documented only in the clinical narrative. Conclusions: Patients who are at risk for negative health outcomes due to SDOH may be better served if health systems can identify SDOH risk factors and associated social needs. Structured semantic representations of text-encoded SDOH information can augment existing structured data, and this more comprehensive SDOH representation can assist health systems in identifying and addressing social needs.
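The comparison between extracted and structured SDOH data described in the abstract can be illustrated with simple set arithmetic. The following is a minimal sketch, not the authors' implementation: the function name `narrative_only_share`, the toy patient IDs, and the choice of denominator (all patients identified by either source) are assumptions made for illustration.

```python
def narrative_only_share(structured_ids, extracted_ids):
    """Share of identified patients whose risk factor appears only in notes.

    structured_ids: patient IDs flagged in structured EHR fields (hypothetical input).
    extracted_ids: patient IDs flagged by an NLP extractor over notes (hypothetical input).
    Assumption: the denominator is every patient flagged by either source.
    """
    structured = set(structured_ids)
    extracted = set(extracted_ids)

    # Patients identified by at least one source.
    all_identified = structured | extracted
    if not all_identified:
        return 0.0

    # Patients the structured data alone would have missed.
    narrative_only = extracted - structured
    return len(narrative_only) / len(all_identified)


# Toy example: patients 1-4 are flagged in structured data, 3-5 in notes.
share = narrative_only_share([1, 2, 3, 4], [3, 4, 5])
print(f"{share:.0%} of identified patients are documented only in notes")
```

In this toy example, one of five identified patients appears only in the notes, so the narrative contributes information that the structured data lacks for that patient.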