Recently, chest X-ray report generation, which aims to automatically generate descriptions of given chest X-ray images, has received growing research interest. The key challenge of chest X-ray report generation is to accurately capture and describe the abnormal regions. In most cases, the normal regions dominate the entire chest X-ray image, and the corresponding descriptions of these normal regions dominate the final report. Due to this data bias, learning-based models may fail to attend to abnormal regions. In this work, to effectively capture and describe abnormal regions, we propose the Contrastive Attention (CA) model. Instead of focusing solely on the current input image, the CA model compares the current input image with normal images to distill contrastive information. The acquired contrastive information better represents the visual features of abnormal regions. Experiments on the public IU-X-ray and MIMIC-CXR datasets show that incorporating our CA into several existing models boosts their performance across most metrics. Further analysis shows that the CA model helps existing models attend more closely to abnormal regions and produce more accurate descriptions, which are crucial for interpretable diagnosis. Notably, we achieve state-of-the-art results on both public datasets.
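To make the core idea concrete, the comparison against normal images can be sketched as follows. This is a minimal, simplified illustration, not the paper's actual CA module: it assumes the input image and a pool of normal reference images have already been encoded into feature vectors, and it distills a contrastive feature by attending over the normal pool and subtracting the attended "normal" summary from the current image's features. The function names and dimensions are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def contrastive_attention(current, normal_pool):
    """Distill contrastive information from a pool of normal images.

    current:     (d,) feature vector of the input chest X-ray
    normal_pool: (n, d) feature vectors of normal reference images
    Returns a (d,) contrastive feature emphasizing what the current
    image contains that the normal images do not.
    """
    # Attention scores: similarity of the current image to each normal image
    scores = normal_pool @ current              # (n,)
    weights = softmax(scores)                   # (n,)
    normal_summary = weights @ normal_pool      # (d,) attended normal summary
    # Subtracting the normal summary leaves the "abnormal" residual
    return current - normal_summary

# Toy usage with random features standing in for encoder outputs
rng = np.random.default_rng(0)
current = rng.normal(size=64)
normal_pool = rng.normal(size=(8, 64))
contrast = contrastive_attention(current, normal_pool)
print(contrast.shape)
```

In practice, such a contrastive feature would be fed to the report decoder alongside (or in place of) the raw visual features, so the generator conditions on what distinguishes the input from normal anatomy rather than on the dominant normal regions.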