Semi-supervised learning (SSL) uses unlabeled data during training to learn better models. Previous studies on SSL for medical image segmentation have focused mostly on improving model generalization to unseen data. In some applications, however, the primary interest is not generalization but obtaining optimal predictions on a specific unlabeled database that is fully available during model development. Examples include population studies that extract imaging phenotypes. This work investigates an often overlooked aspect of SSL: transduction. It focuses on the quality of predictions made on the unlabeled data of interest when those data are included in the optimization during training, rather than on improving generalization. We focus on the self-training framework and explore its potential for transduction. We analyze it through the lens of Information Gain and show that learning benefits from the use of calibrated or under-confident models. Our extensive experiments on a large MRI database for multi-class segmentation of traumatic brain lesions show promising results when comparing transductive with inductive predictions. We believe this study will inspire further research on transductive learning, a paradigm well suited to medical image analysis.
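To make the transductive use of self-training concrete, the following is a minimal sketch and not the method evaluated in this work. It assumes a toy nearest-centroid classifier in place of a segmentation network, and the names `self_train_transductive`, `temperature`, and `threshold` are hypothetical. The unlabeled samples of interest are pseudo-labeled and fed back into training, and the final predictions are made on those same samples. A `temperature` above 1 softens the class probabilities, which is one simple way to obtain a less confident model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fit_centroids(points, labels, n_classes):
    """Mean feature vector per class; assumes every class has at least one point."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(n_classes)]
    counts = [0] * n_classes
    for x, y in zip(points, labels):
        counts[y] += 1
        for j, v in enumerate(x):
            sums[y][j] += v
    return [[s / counts[c] for s in sums[c]] for c in range(n_classes)]


def predict_proba(x, centroids, temperature):
    """Class probabilities from negative squared distances to the centroids.

    A temperature above 1 flattens the distribution (assumption: used here
    as a stand-in for an under-confident model).
    """
    logits = [
        -sum((a - b) ** 2 for a, b in zip(x, c)) / temperature
        for c in centroids
    ]
    return softmax(logits)


def self_train_transductive(x_lab, y_lab, x_unl, n_classes,
                            n_rounds=5, temperature=2.0, threshold=0.8):
    """Self-training where the unlabeled set is also the prediction target."""
    centroids = fit_centroids(x_lab, y_lab, n_classes)
    for _ in range(n_rounds):
        # Start each round from the labeled data only, then add the
        # unlabeled samples whose top probability reaches the threshold.
        points, labels = list(x_lab), list(y_lab)
        for x in x_unl:
            probs = predict_proba(x, centroids, temperature)
            best = max(range(n_classes), key=probs.__getitem__)
            if probs[best] >= threshold:
                points.append(x)
                labels.append(best)
        centroids = fit_centroids(points, labels, n_classes)
    # Transductive output: predictions on the same unlabeled samples
    # that took part in training.
    predictions = []
    for x in x_unl:
        probs = predict_proba(x, centroids, temperature)
        predictions.append(max(range(n_classes), key=probs.__getitem__))
    return predictions
```

An inductive baseline would instead fit the centroids once on the labeled data and predict on the unlabeled set without the pseudo-labeling rounds. Comparing the two outputs on the same samples mirrors, in miniature, the transductive versus inductive comparison described above.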