On the Importance of Image Encoding in Automated Chest X-Ray Report Generation

Chest X-ray is one of the most popular medical imaging modalities due to its accessibility and effectiveness. However, there is a chronic shortage of well-trained radiologists who can interpret these images and diagnose the patient's condition. Automated radiology report generation can therefore be a very helpful tool in clinical practice. A typical report generation workflow consists of two main steps: (i) encoding the image into a latent space and (ii) generating the text of the report based on the latent image embedding. Many existing report generation techniques use a standard convolutional neural network (CNN) architecture for image encoding, followed by a Transformer-based decoder for medical text generation. In most cases, the CNN and the decoder are trained jointly in an end-to-end fashion. In this work, we primarily focus on understanding the relative importance of the encoder and decoder components. To this end, we analyze four different image encoding approaches (direct, fine-grained, CLIP-based, and Cluster-CLIP-based encodings) in conjunction with three different decoders on the large-scale MIMIC-CXR dataset. Among these encoders, the Cluster-CLIP visual encoder is a novel approach that aims to generate more discriminative and explainable representations. CLIP-based encoders produce results comparable to traditional CNN-based encoders in terms of NLP metrics, while fine-grained encoding outperforms all other encoders in terms of both NLP and clinical accuracy metrics, thereby validating the importance of the image encoder in effectively extracting semantic information. GitHub repository: https://github.com/mudabek/encoding-cxr-report-gen
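The two-step workflow described in the abstract (encode the image into latent vectors, then generate text conditioned on them) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: `encode_image`, `generate_report`, and `toy_next_token` are hypothetical names, the patch statistics stand in for a CNN or CLIP encoder, and the rule-based token picker stands in for a trained Transformer decoder.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def encode_image(image: Sequence[Sequence[float]], patch: int = 2) -> List[Vector]:
    """Step (i): turn an image into a sequence of latent vectors.

    Toy stand-in for a learned encoder: each non-overlapping patch is
    summarised by its mean and maximum pixel intensity.
    """
    height, width = len(image), len(image[0])
    latents: List[Vector] = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            pixels = [image[r][c]
                      for r in range(top, top + patch)
                      for c in range(left, left + patch)]
            latents.append([sum(pixels) / len(pixels), max(pixels)])
    return latents


def generate_report(latents: List[Vector],
                    next_token: Callable[[List[Vector], List[str]], str],
                    max_len: int = 20,
                    eos: str = "<eos>") -> List[str]:
    """Step (ii): greedy autoregressive decoding conditioned on the latents."""
    tokens: List[str] = []
    for _ in range(max_len):
        token = next_token(latents, tokens)
        if token == eos:
            break
        tokens.append(token)
    return tokens


def toy_next_token(latents: List[Vector], prefix: List[str]) -> str:
    """Hypothetical rule-based 'decoder'; a real one would be a trained model."""
    bright = max(vec[1] for vec in latents) > 0.8  # arbitrary toy threshold
    template = ["opacity", "present"] if bright else ["no", "acute", "findings"]
    return template[len(prefix)] if len(prefix) < len(template) else "<eos>"


if __name__ == "__main__":
    image = [
        [0.1, 0.2, 0.1, 0.0],
        [0.2, 0.1, 0.0, 0.1],
        [0.1, 0.9, 0.2, 0.1],
        [0.0, 0.1, 0.1, 0.2],
    ]
    latents = encode_image(image)
    print(len(latents), "latent vectors")
    print(" ".join(generate_report(latents, toy_next_token)))
```

Keeping the encoder and decoder behind separate functions mirrors the comparison the abstract describes: one encoding strategy can be swapped for another while the decoding loop stays unchanged.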

