Medical imaging plays a pivotal role in diagnosis and treatment in clinical practice. Inspired by the significant progress in automatic image captioning, various deep learning (DL)-based methods have been proposed to generate radiology reports for medical images. Despite promising results, previous works overlook the uncertainty of their models and are thus unable to provide clinicians with the reliability/confidence of the generated radiology reports to assist their decision-making. In this paper, we propose a novel method to explicitly quantify both the visual uncertainty and the textual uncertainty for DL-based radiology report generation. Such multi-modal uncertainties can sufficiently capture the model's degree of confidence at both the report level and the sentence level, and are therefore further leveraged to weight the training losses for more comprehensive model optimization. Experimental results demonstrate that the proposed method for uncertainty characterization and estimation produces more reliable confidence scores for radiology report generation, and that the modified, uncertainty-aware loss function leads to better model performance on two public radiology report datasets. In addition, the quality of the automatically generated reports was evaluated by human raters, and the results indicate that the proposed uncertainties can reflect the variability of clinical diagnoses.
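The abstract states that the visual and textual uncertainties are leveraged to weight the training losses, but does not give the exact formulation. A minimal sketch, assuming a common learned log-variance weighting scheme for combining per-modality losses (the function name `weighted_total` and the toy values are illustrative assumptions, not taken from the paper):

```python
import math

def weighted_total(losses, log_vars):
    """Combine per-modality losses with learned log-variances s_i.

    Each loss L_i is scaled by exp(-s_i), down-weighting terms the
    model is uncertain about, while the additive s_i acts as a
    regularizer so the model cannot zero out every loss by claiming
    unbounded uncertainty.  (Illustrative assumption, not the paper's
    exact loss.)
    """
    return sum(math.exp(-s) * loss + s for loss, s in zip(losses, log_vars))

# Example: a visual (report-level) loss and a textual (sentence-level)
# loss with equal confidence (log-variance 0) are summed unchanged.
print(weighted_total([1.0, 2.0], [0.0, 0.0]))  # -> 3.0
```

Using `exp(-s)` keeps the effective weight strictly positive for any real-valued log-variance, which is why this parameterization is popular for uncertainty-based loss weighting.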