Electronic health records (EHRs), digital collections of patient healthcare events and observations, are ubiquitous in medicine and critical to healthcare delivery, operations, and research. Despite this central role, EHRs are notoriously difficult to process automatically. Well over half of the information stored in EHRs is unstructured text (e.g., provider notes, operation reports) and remains largely untapped for secondary use. Recently, however, neural network and deep learning approaches to natural language processing (NLP) have made considerable advances, outperforming traditional statistical and rule-based systems on a variety of tasks. In this survey, we summarize current neural NLP methods for EHR applications. We cover a broad range of tasks: classification and prediction, word embeddings, extraction, and generation, as well as other topics such as question answering, phenotyping, knowledge graphs, medical dialogue, multilinguality, and interpretability.