This paper proposes a deep learning-based method that identifies the segments of a clinical note corresponding to broad ICD-9 categories and color-codes them with respect to 17 ICD-9 categories. The proposed Medical Segment Colorer (MSC) architecture is a pipeline framework that works in three stages: (1) word categorization, (2) phrase allocation, and (3) document classification. MSC uses gated recurrent unit neural networks (GRUs) to map an input document to word multi-labels and then to phrase allocations, and uses the statistical median to map phrase allocations to a document multi-label. We compute variable-length segment coloring from overlapping phrase allocation probabilities. These cross-level bidirectional contextual links identify adaptive context and then produce the segment coloring. We train and evaluate MSC on the document-labeled MIMIC-III clinical notes. Training uses document multi-labels only, without any information on phrases, segments, or words. In addition to coloring a clinical note, MSC generates two byproducts: document multi-labeling and word tagging, that is, the creation of ICD-9 category keyword lists based on the segment coloring. Although document multi-labeling is only a byproduct, MSC reaches a micro-average F1-score of 64%, compared with 52.4% for CAML (CNN attention multi-label), a method whose purpose is to produce justifiable document multi-labels. To evaluate the segment coloring, medical practitioners independently assigned the colors to broad ICD-9 categories, given a sample of 40 colored notes and a sample of 50 words related to each category based on the word tags. Binary scoring of this evaluation has a median of 83.3% and a mean of 63.7%.
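The two non-neural steps named above, median aggregation of phrase allocations into a document multi-label and segment coloring from overlapping phrase probabilities, can be illustrated with a minimal sketch. This is not the MSC implementation. The fixed phrase length and stride, the 0.5 thresholds, the averaging of overlapping phrases per word, and the function names `document_labels` and `color_segments` are all assumptions made for illustration; only the use of the median and of overlapping phrase probabilities comes from the text.

```python
from statistics import median


def document_labels(phrase_probs, threshold=0.5):
    """Map phrase allocations to a document multi-label.

    phrase_probs: one list of per-category probabilities per phrase.
    The median over phrases is taken per category (as in the abstract);
    the threshold value is an assumption.
    """
    n_categories = len(phrase_probs[0])
    medians = [median(p[c] for p in phrase_probs) for c in range(n_categories)]
    labels = [c for c, m in enumerate(medians) if m >= threshold]
    return labels, medians


def color_segments(phrase_probs, n_words, phrase_len, stride, min_prob=0.5):
    """Derive variable-length colored segments from overlapping phrases.

    Assumption: phrase i covers words [i*stride, i*stride + phrase_len).
    Each word gets the average probability of the phrases covering it,
    then the best category if it clears min_prob, else no color.
    """
    n_categories = len(phrase_probs[0])
    sums = [[0.0] * n_categories for _ in range(n_words)]
    counts = [0] * n_words
    for i, probs in enumerate(phrase_probs):
        start = i * stride
        for w in range(start, min(start + phrase_len, n_words)):
            counts[w] += 1
            for c in range(n_categories):
                sums[w][c] += probs[c]

    colors = []
    for w in range(n_words):
        if counts[w] == 0:
            colors.append(None)  # word not covered by any phrase
            continue
        avg = [s / counts[w] for s in sums[w]]
        best = max(range(n_categories), key=avg.__getitem__)
        colors.append(best if avg[best] >= min_prob else None)

    # Merge runs of identically colored words into (start, end, category),
    # with end exclusive; uncolored runs are dropped at the end.
    segments = []
    for w, color in enumerate(colors):
        if segments and segments[-1][2] == color:
            segments[-1] = (segments[-1][0], w + 1, color)
        else:
            segments.append((w, w + 1, color))
    return [s for s in segments if s[2] is not None]


if __name__ == "__main__":
    # Three overlapping two-word phrases over a four-word note, two categories.
    probs = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
    print(document_labels(probs))
    print(color_segments(probs, n_words=4, phrase_len=2, stride=1))
```

In this toy input the median keeps only category 0 at the document level, while the coloring still marks the last two words as category 1. This shows why a segment-level view can reveal content that a document-level label suppresses.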