Need for Objective Task-based Evaluation of Deep Learning-Based Denoising Methods: A Study in the Context of Myocardial Perfusion SPECT

Artificial intelligence-based methods have generated substantial interest in nuclear medicine. An area of significant interest has been the use of deep-learning (DL)-based approaches for denoising images acquired with lower doses, shorter acquisition times, or both. Objective evaluation of these approaches is essential for clinical application. DL-based approaches for denoising nuclear-medicine images have typically been evaluated using fidelity-based figures of merit (FoMs) such as root mean squared error (RMSE) and the structural similarity index measure (SSIM). However, these images are acquired for clinical tasks and thus should be evaluated based on their performance in these tasks. Our objectives were to (1) investigate whether evaluation with these FoMs is consistent with objective clinical-task-based evaluation; (2) provide a theoretical analysis for determining the impact of denoising on signal-detection tasks; and (3) demonstrate the utility of virtual clinical trials (VCTs) to evaluate DL-based methods. A VCT to evaluate a DL-based method for denoising myocardial perfusion SPECT (MPS) images was conducted. The impact of DL-based denoising was evaluated using fidelity-based FoMs and the area under the receiver operating characteristic (ROC) curve (AUC), which quantified performance on detecting perfusion defects in MPS images as obtained using a model observer with anthropomorphic channels. Based on fidelity-based FoMs, denoising using the considered DL-based method led to significantly superior performance. However, based on ROC analysis, denoising did not improve, and in fact often degraded, detection-task performance. The results motivate the need for objective task-based evaluation of DL-based denoising approaches. Further, this study shows how VCTs provide a mechanism to conduct such evaluations. Finally, our theoretical treatment reveals insights into the reasons for the limited performance of the denoising approach.
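To make the distinction between the two kinds of FoM concrete, the following is a minimal sketch, not the study's implementation. It assumes NumPy is available, and the function names `rmse` and `auc` are hypothetical. A fidelity-based FoM such as RMSE compares an image with a reference image, whereas a task-based FoM such as AUC depends only on how well an observer's test statistics separate defect-present from defect-absent cases. The study obtained those statistics from a model observer with anthropomorphic channels; here they are simply taken as given numbers.

```python
import numpy as np


def rmse(estimate, reference):
    """Fidelity-based FoM: root mean squared error against a reference image."""
    estimate = np.asarray(estimate, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((estimate - reference) ** 2)))


def auc(scores_present, scores_absent):
    """Task-based FoM: empirical AUC from observer test statistics.

    Computed as the Mann-Whitney statistic, i.e. the fraction of
    (defect-present, defect-absent) pairs that are ranked correctly,
    with ties counted as one half.
    """
    present = np.asarray(scores_present, dtype=float)[:, None]
    absent = np.asarray(scores_absent, dtype=float)[None, :]
    wins = (present > absent).sum() + 0.5 * (present == absent).sum()
    return float(wins / (present.size * absent.size))


if __name__ == "__main__":
    # Made-up numbers for illustration only; they are not data from the study.
    print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
    print(auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2]))
```

Because `rmse` never looks at the detection task and `auc` never looks at the reference image, nothing forces the two to move together, which is why an improvement in one does not by itself imply an improvement in the other.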
