Automated medical coding, an essential task for healthcare operation and delivery, makes unstructured data manageable by predicting medical codes from clinical documents. Recent advances in deep learning for natural language processing have been widely applied to this task. However, the field lacks a unified view of how neural network architectures for medical coding are designed. This review proposes a unified framework that provides a general understanding of the building blocks of medical coding models and summarizes recent advanced models under that framework. The framework decomposes medical coding into four main components: encoder modules for text feature extraction, mechanisms for building deep encoder architectures, decoder modules for transforming hidden representations into medical codes, and the use of auxiliary information. Finally, we discuss key research challenges and future directions.
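To make the four-component decomposition concrete, the following is a minimal sketch in plain Python, not the implementation of any model covered by the review. Every name here (`embed`, `encode`, `deepen`, `decode`, `predict_codes`, the toy dimension `DIM`, and the placeholder codes `CODE_A` and `CODE_B`) is a hypothetical chosen for illustration. The sine-based embedding stands in for a learned one, and using code descriptions as label vectors is only one assumed way of exploiting auxiliary information.

```python
import math

DIM = 8  # toy hidden size (assumption, not from the review)

def embed(token):
    # Deterministic toy embedding; a real model would learn this.
    seed = sum(ord(c) for c in token)
    return [math.sin((i + 1) * seed) for i in range(DIM)]

def encode(tokens):
    # Component 1, encoder module: mean-pool token embeddings
    # into a single text feature vector.
    vecs = [embed(t) for t in tokens]
    if not vecs:
        return [0.0] * DIM
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def deepen(h, layers=2):
    # Component 2, deep-architecture mechanism: stacked layers
    # joined by residual connections.
    for _ in range(layers):
        h = [x + math.tanh(x) for x in h]
    return h

def decode(h, label_vectors):
    # Component 3, decoder module: one independent sigmoid score
    # per code, i.e. multi-label prediction.
    scores = {}
    for code, w in label_vectors.items():
        z = sum(a * b for a, b in zip(h, w))
        scores[code] = 1.0 / (1.0 + math.exp(-z))
    return scores

def predict_codes(note, code_descriptions):
    # Component 4, auxiliary information: textual code descriptions
    # are encoded with the same encoder and used as label vectors.
    label_vectors = {
        code: encode(desc.lower().split())
        for code, desc in code_descriptions.items()
    }
    h = deepen(encode(note.lower().split()))
    return decode(h, label_vectors)

# Usage with placeholder codes and descriptions (hypothetical data).
codes = {
    "CODE_A": "high blood pressure",
    "CODE_B": "fracture of the left wrist",
}
scores = predict_codes("patient admitted with high blood pressure", codes)
```

The sketch keeps each component behind its own function so that one part, for example the mean-pooling encoder, could be swapped for a convolutional, recurrent, or Transformer encoder without touching the decoder.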