We explore the value of weak labels in learning transferable representations for medical images. Compared to hand-labeled datasets, weak or inexact labels can be acquired in large quantities at significantly lower cost and can provide useful training signals for data-hungry models such as deep neural networks. We consider weak labels in the form of pseudo-labels and propose a semi-weakly supervised contrastive learning (SWCL) framework for representation learning using semi-weakly annotated images. Specifically, we train a semi-supervised model to propagate labels from a small dataset with diverse image-level annotations to a large unlabeled dataset. Using the propagated labels, we generate a patch-level dataset for pretraining and formulate a multi-label contrastive learning objective to capture position-specific features encoded in each patch. We empirically validate the transfer learning performance of SWCL on seven public retinal fundus datasets, covering three disease classification tasks and two anatomical structure segmentation tasks. Our experimental results suggest that, in the very-low-data regime, large-scale ImageNet pretraining on an improved architecture remains a very strong baseline, while recently proposed self-supervised methods falter on segmentation tasks, possibly due to the strong invariance constraints they impose. Our method surpasses all prior self-supervised methods and standard cross-entropy training, while closing the gap with ImageNet pretraining.
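To make the multi-label contrastive objective concrete, the sketch below shows one plausible SupCon-style formulation over patch embeddings, where pairs sharing at least one propagated label are treated as positives and weighted by the Jaccard overlap of their label sets. This is only an illustrative assumption: the function name, the Jaccard weighting, and the temperature value are not taken from the paper.

```python
import torch
import torch.nn.functional as F

def multilabel_contrastive_loss(embeddings, labels, temperature=0.1):
    """Illustrative multi-label contrastive loss (SupCon-style sketch).

    embeddings: (N, D) patch embeddings from the encoder
    labels:     (N, C) multi-hot label vectors per patch
    Patches sharing at least one label act as positives for each other,
    weighted by the Jaccard overlap of their label sets (an assumption,
    not necessarily the paper's exact formulation).
    """
    z = F.normalize(embeddings, dim=1)
    sim = torch.matmul(z, z.t()) / temperature          # (N, N) scaled similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, -1e9)              # exclude self-comparisons
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    # Soft positive weights: Jaccard overlap between multi-hot label sets.
    labels = labels.float()
    inter = labels @ labels.t()
    union = labels.sum(1, keepdim=True) + labels.sum(1) - inter
    pos_w = (inter / union.clamp(min=1)).masked_fill(self_mask, 0.0)

    # Weighted average log-likelihood over positives for each anchor.
    denom = pos_w.sum(1).clamp(min=1e-8)
    loss = -(pos_w * log_prob).sum(1) / denom

    # Ignore anchors that have no positive partner in the batch.
    valid = pos_w.sum(1) > 0
    return loss[valid].mean()
```

In this reading, reducing every label vector to a single class index recovers the standard supervised contrastive loss, while the multi-hot form lets each patch contribute to several position-specific positive sets at once.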