The Electronic Health Record (EHR) is an essential part of the modern medical system and affects healthcare delivery, operations, and research. Although EHRs contain structured information, the unstructured text in them is attracting growing attention and has become an active research field. The success of recent neural Natural Language Processing (NLP) methods has opened a new direction for processing unstructured clinical notes. In this work, we create a Python library for clinical texts, EHRKit. The library contains two main parts: MIMIC-III-specific functions and task-specific functions. The first part provides a set of interfaces for accessing MIMIC-III NOTEEVENTS data, including basic search, information retrieval, and information extraction. The second part integrates many third-party libraries to support up to 12 off-the-shelf NLP tasks, such as named entity recognition, summarization, and machine translation.