International Classification of Diseases (ICD) codes are the de facto standard for clinical coding worldwide. These codes enable healthcare providers to claim reimbursement and facilitate efficient storage and retrieval of diagnostic information. The problem of automatically assigning ICD codes has been approached in the literature as a multilabel classification problem, using neural models on unstructured data. Our proposed approach enhances the performance of neural models by training word vectors on routine medical data as well as on external knowledge from scientific articles. Furthermore, we exploit the geometric properties of the two sets of word vectors and combine them into a common vector space using meta-embedding techniques. We demonstrate the efficacy of this approach in a multimodal setting that uses both unstructured and structured information. We empirically show that our approach improves on current state-of-the-art deep learning architectures and also benefits ensemble models.
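To make the idea of combining two sets of word vectors concrete, the following is a minimal sketch of two common meta-embedding baselines: concatenation and averaging with zero-padding. The abstract does not say which meta-embedding techniques are used, so these two are assumptions chosen for illustration. The function names, the toy vocabularies, and the vector values are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so that both sources contribute equally."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def _dim(source):
    """Dimensionality of a source, read from its first vector."""
    return len(next(iter(source.values())))


def concat_meta_embedding(src_a, src_b):
    """Concatenate normalized vectors; a word missing from one source gets zeros."""
    dim_a, dim_b = _dim(src_a), _dim(src_b)
    meta = {}
    for word in sorted(set(src_a) | set(src_b)):
        part_a = l2_normalize(src_a[word]) if word in src_a else [0.0] * dim_a
        part_b = l2_normalize(src_b[word]) if word in src_b else [0.0] * dim_b
        meta[word] = part_a + part_b
    return meta


def average_meta_embedding(src_a, src_b):
    """Zero-pad to a common dimension, then average the vectors that exist."""
    dim = max(_dim(src_a), _dim(src_b))
    meta = {}
    for word in sorted(set(src_a) | set(src_b)):
        vectors = []
        for source in (src_a, src_b):
            if word in source:
                unit = l2_normalize(source[word])
                vectors.append(unit + [0.0] * (dim - len(unit)))
        meta[word] = [sum(column) / len(vectors) for column in zip(*vectors)]
    return meta


# Hypothetical toy vectors: one set "trained" on clinical notes,
# the other on scientific articles, with different dimensionalities.
clinical_vectors = {"sepsis": [3.0, 4.0], "fever": [1.0, 0.0]}
scientific_vectors = {"sepsis": [0.0, 2.0, 0.0], "antibiotic": [0.0, 0.0, 5.0]}

concat_vectors = concat_meta_embedding(clinical_vectors, scientific_vectors)
average_vectors = average_meta_embedding(clinical_vectors, scientific_vectors)
```

Concatenation preserves all information from both sources at the cost of a larger dimension, while averaging keeps the dimension small but mixes the two spaces. Either result could then be used to initialize the embedding layer of a multilabel ICD classifier.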