Pre-training a recognition model with contrastive learning on a large dataset of unlabeled data has shown great potential to boost the performance of downstream tasks such as image classification. However, in domains like medical imaging, collecting unlabeled data can be challenging and expensive. In this work, we propose to adapt contrastive learning to work with meta-label annotations, improving the model's performance in medical image segmentation even when no additional unlabeled data is available. Meta-labels, such as the location of a 2D slice in a 3D MRI scan or the type of device used, often come for free during the acquisition process. We use these meta-labels to pre-train the image encoder and to regularize semi-supervised training, in which only a reduced set of annotated data is available. Finally, to fully exploit these weak annotations, a self-paced learning approach is used to guide learning and discriminate useful labels from noise. Results on three different medical image segmentation datasets show that our approach: i) substantially boosts the performance of a model trained on a few scans, ii) outperforms previous contrastive and semi-supervised approaches, and iii) comes close to the performance of a model trained on the full data.
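To make the meta-label pre-training step concrete, below is a minimal sketch, not the authors' released code, of a supervised-contrastive-style loss in which two samples are treated as positives whenever they share a meta-label, e.g., the quantized position of a 2D slice within its 3D scan. The function name, batch layout, and temperature value are illustrative assumptions.

```python
# Minimal sketch of contrastive pre-training driven by meta-labels:
# samples sharing a meta-label (e.g., a slice-position bin) act as
# positives in a supervised-contrastive-style loss. Names and the
# temperature are illustrative assumptions, not the paper's code.
import torch
import torch.nn.functional as F

def meta_label_contrastive_loss(z: torch.Tensor,
                                meta_labels: torch.Tensor,
                                temperature: float = 0.1) -> torch.Tensor:
    """z: (N, d) embeddings from the encoder's projection head.
    meta_labels: (N,) integer meta-label of each sample."""
    z = F.normalize(z, dim=1)                    # compare in cosine space
    sim = z @ z.t() / temperature                # (N, N) similarity logits
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    # Positives: other samples that carry the same meta-label.
    pos_mask = (meta_labels.unsqueeze(0) == meta_labels.unsqueeze(1)) & ~self_mask
    # Softmax over all samples except the anchor itself.
    sim = sim.masked_fill(self_mask, float('-inf'))
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # Mean negative log-probability of the positives, per anchor.
    pos_count = pos_mask.sum(dim=1)
    valid = pos_count > 0                        # skip anchors with no positive
    per_anchor = -log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1)
    return (per_anchor[valid] / pos_count[valid]).mean()

# Toy usage: 8 embeddings, meta-label = slice-position bin (0..3).
z = torch.randn(8, 32)
bins = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])
print(meta_label_contrastive_loss(z, bins))
```

In the setting described above, an encoder pre-trained with such a loss would then be fine-tuned for segmentation on the reduced set of annotated scans.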