Semi-self-supervised Automated ICD Coding

Clinical Text Notes (CTNs) record physicians' reasoning, written as unstructured free text, as they examine and interview patients. Several recent studies provide evidence for the utility of machine learning in predicting doctors' diagnoses from CTNs, a task known as ICD coding. Data annotation is time-consuming, particularly when a degree of specialization is needed, as is the case for medical data. This paper presents a method of augmenting a sparsely annotated dataset of Icelandic CTNs with machine-learned imputation in a semi-self-supervised manner. We train a neural network on a small set of annotated CTNs and use it to extract clinical features from a set of un-annotated CTNs. These clinical features consist of answers to about a thousand potential questions that a physician might answer during a patient consultation. The features are then used to train a classifier for the diagnosis of certain types of diseases. We report the results of an evaluation of this data augmentation method over three tiers of data availability to the physician. Our data augmentation method shows a significant positive effect, which diminishes when clinical features from the examination of the patient and from diagnostics are made available. We recommend our method for augmenting scarce datasets for systems that make decisions based on clinical features that do not include examinations or tests.
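
The pipeline described in the abstract (train a feature extractor on a few annotated notes, impute features for un-annotated notes, then train a diagnosis classifier on the enlarged set) can be sketched roughly as follows. This is a toy illustration under loud assumptions, not the paper's implementation: a keyword-cue extractor stands in for the neural network, a nearest-centroid rule stands in for the classifier, the un-annotated notes are assumed to still carry a diagnosis label, and every function name, feature name, and note below is hypothetical.

```python
from collections import defaultdict

def tokenize(text):
    return set(text.lower().split())

def train_extractor(annotated):
    """Stand-in for the neural network: per feature, keep the tokens that
    occur only in notes where that feature was annotated as present."""
    pos, neg = defaultdict(set), defaultdict(set)
    for text, features, _diagnosis in annotated:
        for name, present in features.items():
            (pos if present else neg)[name] |= tokenize(text)
    return {name: pos[name] - neg[name] for name in pos}

def extract(extractor, text):
    """Impute binary clinical features for a note without annotations."""
    tokens = tokenize(text)
    return {name: int(bool(cues & tokens)) for name, cues in extractor.items()}

def train_classifier(rows):
    """Stand-in classifier: mean feature vector (centroid) per diagnosis."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for features, diagnosis in rows:
        counts[diagnosis] += 1
        for name, value in features.items():
            sums[diagnosis][name] += value
    return {d: {n: s / counts[d] for n, s in sums[d].items()} for d in sums}

def predict(centroids, features):
    """Return the diagnosis whose centroid is closest in squared distance."""
    def dist(centroid):
        names = set(centroid) | set(features)
        return sum((centroid.get(n, 0.0) - features.get(n, 0)) ** 2 for n in names)
    return min(sorted(centroids), key=lambda d: dist(centroids[d]))

# Hypothetical data: a few notes with feature annotations and a diagnosis...
annotated = [
    ("persistent cough and fever", {"cough": 1, "fever": 1, "rash": 0}, "respiratory"),
    ("dry cough for days", {"cough": 1, "fever": 0, "rash": 0}, "respiratory"),
    ("itchy rash on arms", {"cough": 0, "fever": 0, "rash": 1}, "skin"),
    ("fever with rash on chest", {"cough": 0, "fever": 1, "rash": 1}, "skin"),
]
# ...and notes that have a diagnosis but no feature annotations (assumption).
unannotated = [
    ("cough worse at night", "respiratory"),
    ("red rash spreading", "skin"),
]

extractor = train_extractor(annotated)

# Augmentation step: annotated features plus machine-imputed features.
rows = [(features, diagnosis) for _text, features, diagnosis in annotated]
rows += [(extract(extractor, text), diagnosis) for text, diagnosis in unannotated]

centroids = train_classifier(rows)
prediction = predict(centroids, extract(extractor, "high fever and cough"))
print(prediction)
```

The point of the sketch is the data flow rather than the models: the classifier never sees raw text, only clinical features, so imputed features from un-annotated notes can be appended to the training set in the same form as the manually annotated ones.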

