With the development of natural language processing (NLP) techniques, automatic diagnosis of eye diseases from ophthalmology electronic medical records (OEMR) has become possible. The goal is to evaluate the condition of each of a patient's eyes separately, and in this paper we formulate it as a particular multi-label classification task. Although a few related studies exist for other diseases, automatic diagnosis of eye diseases has unique characteristics. First, descriptions of both eyes are mixed together in OEMR documents, which combine free text with templated asymptomatic descriptions, so the information is sparse and cluttered. Second, OEMR documents contain multiple descriptive parts and are long. Third, it is critical that the disease diagnosis model be explainable. To overcome these challenges, we present an effective automatic eye disease diagnosis framework, NEEDED. In this framework, a preprocessing module is integrated to improve the density and quality of the information. We then design a hierarchical transformer structure to learn contextualized representations of each sentence in the OEMR document. For the diagnosis part, we propose an attention-based predictor that enables traceable diagnosis by obtaining disease-specific information. Experiments on a real-world dataset and comparisons with several baseline models show the advantage and explainability of our framework.
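
The idea of an attention-based predictor that obtains disease-specific information can be illustrated with a minimal sketch. This is not the NEEDED implementation: the abstract does not give architectural details, so the sketch assumes one learned query vector per disease label, scaled dot-product attention over sentence representations, and an independent sigmoid output per label. All names (`label_attention`, `predict`, the toy vectors) are hypothetical. The attention weights indicate which sentence contributed most to each label, which is one way such a predictor could make a diagnosis traceable.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def label_attention(
    sentence_vecs: List[Vector],
    label_queries: List[Vector],
) -> Tuple[List[Vector], List[List[float]]]:
    """For each label, attend over sentences and pool a label-specific vector.

    Returns (pooled vectors, attention weights), one row per label.
    """
    pooled, weights = [], []
    dim = len(sentence_vecs[0])
    for query in label_queries:
        # Scaled dot-product score between this label's query and each sentence.
        scores = [dot(query, s) / math.sqrt(dim) for s in sentence_vecs]
        alpha = softmax(scores)
        # Weighted sum of sentence vectors gives the label-specific summary.
        summary = [
            sum(a * s[i] for a, s in zip(alpha, sentence_vecs))
            for i in range(dim)
        ]
        pooled.append(summary)
        weights.append(alpha)
    return pooled, weights


def predict(
    pooled: List[Vector],
    label_weights: List[Vector],
    label_biases: List[float],
) -> List[float]:
    """Independent sigmoid per label (multi-label classification)."""
    return [
        1.0 / (1.0 + math.exp(-(dot(w, v) + b)))
        for v, w, b in zip(pooled, label_weights, label_biases)
    ]


# Toy example (assumed data): 3 sentence vectors, 2 hypothetical labels.
sentences = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
queries = [[4.0, 0.0], [0.0, 4.0]]
pooled, attn = label_attention(sentences, queries)
probs = predict(pooled, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])

# Index of the most-attended sentence per label: the "trace" for each diagnosis.
top_sentence = [max(range(len(a)), key=a.__getitem__) for a in attn]
```

In a real system the sentence vectors would come from the sentence-level encoder and the queries and classifier weights would be learned; here they are fixed by hand so that the attention pattern is easy to inspect.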