Automated radiology report generation aims to automatically produce a detailed description of medical images, which can greatly alleviate the workload of radiologists and provide better medical services to remote areas. Most existing works focus on the holistic impression of medical images and fail to exploit important anatomical information. In actual clinical practice, however, radiologists usually locate important anatomical structures first, then look for signs of abnormality within those structures and reason about the underlying disease. In this paper, we propose a novel framework, AGFNet, that dynamically fuses global and anatomical region features to generate multi-grained radiology reports. First, we extract anatomical region features and global features from the input Chest X-ray (CXR). Then, taking the region features and the global features as input, our proposed self-adaptive fusion gate module dynamically fuses the multi-granularity information. Finally, the caption generator produces the radiology report from the multi-granularity features. Experimental results show that our model achieves state-of-the-art performance on two benchmark datasets, IU X-Ray and MIMIC-CXR. Further analyses confirm that our model is able to leverage multi-grained information from radiology images and texts, helping it generate more accurate reports.
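The abstract does not give the exact formulation of the self-adaptive fusion gate, but a common design for such a module is a learned sigmoid gate that produces a convex combination of the two feature streams. The sketch below illustrates that assumption with NumPy; the weight matrix `W`, bias `b`, and the concatenation-based gating are hypothetical placeholders, not the paper's confirmed architecture.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(global_feat, region_feat, W, b):
    """Hypothetical self-adaptive fusion gate (assumed form):
    gate  = sigmoid(W @ [global; region] + b)
    fused = gate * global + (1 - gate) * region
    The gate lies in (0, 1), so each fused dimension is a convex
    combination of the global and region features.
    """
    concat = np.concatenate([global_feat, region_feat])
    gate = sigmoid(W @ concat + b)
    return gate * global_feat + (1.0 - gate) * region_feat

# Toy example with 8-dimensional features.
rng = np.random.default_rng(0)
d = 8
g = rng.standard_normal(d)            # global CXR feature
r = rng.standard_normal(d)            # anatomy region feature
W = rng.standard_normal((d, 2 * d)) * 0.1
b = np.zeros(d)
fused = gated_fusion(g, r, W, b)
print(fused.shape)  # (8,)
```

In a real model the gate would be trained end-to-end with the caption generator, letting the network decide per-dimension how much global versus region evidence to pass on.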