Current artificial intelligence (AI) algorithms for short-axis cardiac magnetic resonance (CMR) segmentation achieve human-level performance for slices situated in the middle of the heart. However, segmentation of the basal and apical slices is more difficult, a fact that is often overlooked. In manual analysis, differences in the basal segmentations have been reported as one of the major sources of interobserver variability. In this work, we investigate the performance of AI algorithms in segmenting basal and apical slices and design strategies to improve it. We trained all our models on a large dataset of clinical CMR studies obtained from two NHS hospitals (n=4,228) and evaluated them on two external datasets: ACDC (n=100) and M&Ms (n=321). Using manual segmentations as a reference, CMR slices were assigned to one of four regions: non-cardiac, base, middle, and apex. Using the nnU-Net framework as a baseline, we investigated two approaches to reduce the gap in segmentation performance between cardiac regions: (1) non-uniform batch sampling, which allows us to choose how often images from different regions are seen during training; and (2) a cardiac-region classification model followed by three region-specific segmentation models (base, middle, and apex). We show that the classification-then-segmentation approach was best at reducing the performance gap across all datasets. We also show that improvements in classification performance can subsequently lead to significantly better performance in the segmentation task.
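To make the two strategies concrete, the following is a minimal Python sketch, not the implementation used in this work. It assumes slices are already grouped by region; the names `sample_batch`, `segment_by_region`, `classify_region`, and `segmenters`, as well as any region weights passed in, are hypothetical and chosen only for illustration.

```python
import random

# The four regions named in the abstract.
REGIONS = ("non-cardiac", "base", "middle", "apex")


def sample_batch(slices_by_region, region_weights, batch_size, rng=None):
    """Non-uniform batch sampling: choose how often each region is seen.

    slices_by_region: dict mapping region name -> list of slice identifiers.
    region_weights:   dict mapping region name -> relative sampling weight
                      (hypothetical values, not taken from the paper).
    """
    if rng is None:
        rng = random.Random()
    # Only regions that actually contain slices can be drawn from.
    regions = [r for r in REGIONS if slices_by_region.get(r)]
    weights = [region_weights[r] for r in regions]
    # Step 1: draw a region for every batch element according to the weights.
    chosen = rng.choices(regions, weights=weights, k=batch_size)
    # Step 2: draw a slice uniformly at random within each chosen region.
    return [(r, rng.choice(slices_by_region[r])) for r in chosen]


def segment_by_region(slice_image, classify_region, segmenters):
    """Classification followed by a region-specific segmentation model.

    classify_region: callable returning one of REGIONS for a slice.
    segmenters:      dict mapping "base", "middle", "apex" -> callable model.
    """
    region = classify_region(slice_image)
    if region == "non-cardiac":
        # No cardiac structure is expected, so no segmentation model is run.
        return region, None
    return region, segmenters[region](slice_image)
```

For example, passing a larger weight for `"base"` and `"apex"` than for `"middle"` would make basal and apical slices appear more often per batch than under uniform sampling; which weights work best is an empirical question that this sketch does not answer.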