Domain shift refers to a difference in data distribution between two datasets, typically between the training set and the test set of a machine learning algorithm. It is a serious obstacle to the generalization of machine learning models, and it is well established that a domain shift between the training and test sets can cause a drastic drop in model performance. In medical imaging, domain shift can arise from many sources, such as different scanners or scan protocols, different pathologies in the patient population, and anatomical differences in the patient population (e.g., between men and women). Therefore, to train models that generalize well, it is important to be aware of the domain shift problem and its potential causes, and to devise ways to address it. In this paper, we study the effect of domain shift on left and right ventricle blood pool segmentation in short-axis cardiac MR images. Our dataset contains short-axis images from 4 different MR scanners and 3 different pathology groups. Training is performed with nnUNet. The results show that scanner differences cause a greater drop in performance than changes in pathology group, and that the impact of domain shift is greater on right ventricle segmentation than on left ventricle segmentation. Increasing the number of training subjects improved cross-scanner performance more than in-scanner performance at small training set sizes, but this difference in improvement diminished at larger training set sizes. Training models on data from multiple scanners improved cross-domain performance.
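
The comparison of in-scanner and cross-scanner performance described above can be illustrated with a minimal sketch. The abstract does not state the evaluation metric or the exact split protocol, so everything here is an assumption made for illustration: the Dice overlap as the segmentation score, the scanner labels `A` to `D`, and the helper names `dice` and `domain_pairs`. It is not the authors' pipeline and does not use nnUNet.

```python
from itertools import product


def dice(pred, truth):
    """Dice overlap between two binary masks given as flat sequences of 0/1.

    Assumed metric: the abstract does not name the score that was used.
    """
    intersection = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def domain_pairs(scanners):
    """Enumerate (train_scanner, test_scanner) pairs.

    In-scanner pairs share a scanner (train and test subjects must still be
    disjoint); cross-scanner pairs test on a scanner unseen during training.
    """
    in_domain = [(s, s) for s in scanners]
    cross_domain = [(a, b) for a, b in product(scanners, repeat=2) if a != b]
    return in_domain, cross_domain


# Hypothetical labels standing in for the 4 MR scanners in the dataset.
scanners = ["A", "B", "C", "D"]
in_domain, cross_domain = domain_pairs(scanners)
```

With 4 scanners this yields 4 in-scanner and 12 cross-scanner train/test pairs; averaging a score such as `dice` within each group and comparing the two averages gives one simple estimate of the performance drop attributable to scanner shift.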