Automated generation of clinically accurate radiology reports can improve patient care. Previous report generation methods that rely on image captioning models often generate incoherent and incorrect text due to their lack of relevant domain knowledge, while retrieval-based attempts frequently retrieve reports that are irrelevant to the input image. In this work, we propose Contrastive X-Ray REport Match (X-REM), a novel retrieval-based radiology report generation module that uses an image-text matching score to measure the similarity of a chest X-ray image and radiology report for report retrieval. We observe that computing the image-text matching score with a language-image model can effectively capture the fine-grained interaction between image and text that is often lost when using cosine similarity. X-REM outperforms multiple prior radiology report generation modules in terms of both natural language and clinical metrics. Human evaluation of the generated reports suggests that X-REM increased the number of zero-error reports and decreased the average error severity compared to the baseline retrieval approach. Our code is available at: https://github.com/rajpurkarlab/X-REM
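To make the contrast between the two retrieval criteria concrete, the following is a minimal sketch, not the authors' implementation. It ranks candidate reports for one image, first by cosine similarity between embeddings and then by a pluggable pairwise matching score. All names here (`retrieve_by_cosine`, `retrieve_by_matching_score`, `toy_match_score`) are hypothetical, the vectors are made up, and `toy_match_score` is only a toy stand-in for the score that a trained language-image model would produce.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(scores: Sequence[float], k: int) -> List[int]:
    """Indices of the k highest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def retrieve_by_cosine(image_emb: Vector, report_embs: Sequence[Vector], k: int) -> List[int]:
    """Baseline: rank reports by cosine similarity to the image embedding."""
    scores = [cosine_similarity(image_emb, r) for r in report_embs]
    return top_k(scores, k)


def retrieve_by_matching_score(
    image_emb: Vector,
    report_embs: Sequence[Vector],
    match_score: Callable[[Vector, Vector], float],
    k: int,
) -> List[int]:
    """Rank reports by a pairwise matching score supplied by the caller.

    In a real system, match_score would wrap a trained language-image model
    that looks at the image and the report jointly. Here it is any callable.
    """
    scores = [match_score(image_emb, r) for r in report_embs]
    return top_k(scores, k)


def toy_match_score(image_emb: Vector, report_emb: Vector) -> float:
    """Toy stand-in (assumption): negative L1 distance, so per-dimension
    disagreement lowers the score. It is NOT a learned matching model."""
    return -sum(abs(x - y) for x, y in zip(image_emb, report_emb))


# Made-up 2-d embeddings: report 0 points in the same direction as the image
# but is far away; report 1 is close but slightly off-direction.
image = [1.0, 1.0]
reports = [[10.0, 10.0], [1.0, 0.9]]

cosine_best = retrieve_by_cosine(image, reports, k=1)
match_best = retrieve_by_matching_score(image, reports, toy_match_score, k=1)
```

The only point of the sketch is that the two criteria can disagree on the top-ranked report for the same candidates. Whether a learned matching score retrieves more relevant reports than cosine similarity is an empirical question, which the abstract answers with its reported metrics and human evaluation.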