Machine learning algorithms underpin modern diagnostic-aiding software, which has proved valuable in clinical practice, particularly in radiology. However, inaccuracies, mainly due to the limited availability of clinical samples for training these algorithms, hamper their wider applicability, acceptance, and recognition amongst clinicians. We present an analysis of state-of-the-art automatic quality control (QC) approaches that can be implemented within these algorithms to estimate the certainty of their outputs. We validated the most promising approaches on a brain image segmentation task: identifying white matter hyperintensities (WMH) in magnetic resonance imaging data. WMH are a correlate of small vessel disease common in mid-to-late adulthood and are particularly challenging to segment because of their varied size and distribution patterns. Our results show that uncertainty aggregation and Dice prediction were the most effective methods for failure detection on this task. Both methods independently improved mean Dice from 0.82 to 0.84. Our work shows how QC methods can help detect failed segmentation cases and thereby make automatic segmentation more reliable and suitable for clinical practice.
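To make the quantities in the abstract concrete, the following is a minimal sketch of the Dice coefficient, one possible way to aggregate voxel-wise uncertainty into a single case-level score, and a simple QC filter. It is not the authors' implementation. The function names, the choice of mean binary entropy as the aggregate, the threshold, and the idea that flagged cases are simply excluded before averaging are all assumptions made for illustration; the numbers in the usage note are toy values, not results from the study.

```python
import numpy as np


def dice(pred, truth):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement (a convention, assumed here).
        return 1.0
    return float(2.0 * np.logical_and(pred, truth).sum() / denom)


def aggregate_uncertainty(prob):
    """Case-level uncertainty: mean voxel-wise binary entropy of a
    foreground-probability map (one assumed aggregation among many)."""
    p = np.clip(np.asarray(prob, dtype=float), 1e-7, 1.0 - 1e-7)
    entropy = -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
    return float(entropy.mean())


def mean_dice_after_qc(dice_scores, qc_scores, threshold):
    """Mean Dice over cases whose QC score is at or below `threshold`.

    `qc_scores` is any 'higher means more suspicious' score, for example
    aggregated uncertainty, or one minus a predicted Dice (hypothetical usage).
    Returns NaN if every case is flagged.
    """
    dice_scores = np.asarray(dice_scores, dtype=float)
    keep = np.asarray(qc_scores, dtype=float) <= threshold
    if not keep.any():
        return float("nan")
    return float(dice_scores[keep].mean())
```

With toy inputs, `mean_dice_after_qc([0.9, 0.8, 0.4], [0.1, 0.2, 0.9], threshold=0.5)` drops the third case, whose QC score exceeds the threshold, and averages the remaining two. In practice the threshold would need to be tuned on held-out data.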