Medical imaging plays a pivotal role in diagnosis and treatment in clinical practice. Inspired by significant progress in automatic image captioning, various deep learning (DL)-based architectures have been proposed for generating radiology reports from medical images. However, model uncertainty (i.e., how reliable or confident a model is in the reports it generates) remains under-explored. In this paper, we propose a novel method that explicitly quantifies both the visual uncertainty and the textual uncertainty in radiology report generation. These multi-modal uncertainties capture model confidence at both the report level and the sentence level, and we further use them to weight the losses for more comprehensive model optimization. Experimental results demonstrate that the proposed method for uncertainty characterization and estimation provides more reliable confidence scores for radiology report generation, and that the proposed uncertainty-weighted losses achieve more comprehensive model optimization and yield state-of-the-art performance on a public radiology report dataset.
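To make the idea of uncertainty-weighted losses concrete, the following is a minimal sketch of how sentence-level and report-level uncertainty scores could weight a report's training loss. The abstract does not give the actual formulation, so the function name `uncertainty_weighted_loss`, its parameters, and the `1 + u` weighting rule are all illustrative assumptions rather than the method proposed in the paper.

```python
from typing import Sequence


def uncertainty_weighted_loss(
    sentence_losses: Sequence[float],
    sentence_uncertainties: Sequence[float],
    report_uncertainty: float,
) -> float:
    """Combine per-sentence losses using uncertainty scores as weights.

    Hypothetical scheme for illustration only; not the paper's formula.
    Assumes all uncertainty scores are non-negative.
    """
    if len(sentence_losses) != len(sentence_uncertainties):
        raise ValueError("need one uncertainty score per sentence loss")
    if not sentence_losses:
        return 0.0

    # Assumption: a more uncertain sentence receives a larger weight,
    # so the optimizer focuses on sentences the model is unsure about.
    weights = [1.0 + u for u in sentence_uncertainties]

    # Weighted mean of the sentence-level losses.
    sentence_term = sum(
        w * loss for w, loss in zip(weights, sentence_losses)
    ) / sum(weights)

    # Assumption: the report-level uncertainty rescales the whole report's loss.
    return (1.0 + report_uncertainty) * sentence_term


if __name__ == "__main__":
    # Two sentences; the second is both harder and more uncertain.
    print(uncertainty_weighted_loss([1.0, 3.0], [0.0, 1.0], 0.5))
```

With all uncertainties set to zero, this sketch reduces to the plain mean of the sentence losses, which makes it easy to compare against an unweighted baseline.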