Few-shot learning has been an established topic in natural images for years, but little work has attended to histology images, which are of high clinical value since well-labeled datasets and rare abnormal samples are expensive to collect. Here, we facilitate the study of few-shot learning in histology images by setting up three cross-domain tasks that simulate real clinical problems. To enable label-efficient learning and better generalizability, we propose to incorporate contrastive learning (CL) with latent augmentation (LA) to build a few-shot system. CL learns useful representations without manual labels, while LA transfers semantic variations of the base dataset in an unsupervised way. These two components fully exploit unlabeled training data and can scale gracefully to other label-hungry problems. In experiments, we find that i) models learned by CL generalize better than supervised models on unseen classes of histology images, and ii) LA brings consistent gains over baselines. Prior studies of self-supervised learning mainly focus on ImageNet-like images, which present only a single dominant object at their centers; recent attention has turned to images with multiple objects and textures, and histology images are a natural choice for such a study. We show the superiority of CL over supervised learning in terms of generalization on such data and provide our empirical understanding of this observation. The findings in this work could contribute to understanding how models generalize in the context of both representation learning and histological image analysis. Code is available.
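To make the latent-augmentation idea concrete, below is a minimal sketch of one plausible realization, not the paper's exact method: unlabeled base-set embeddings are clustered, a covariance matrix is estimated per cluster, and a few-shot support embedding is expanded by sampling additive noise from the covariance of its nearest cluster. All function names, the clustering choice (plain k-means), and the Gaussian noise model are illustrative assumptions.

```python
import numpy as np

def fit_latent_augmenter(base_feats, n_clusters=8, n_iters=20, seed=0):
    """Cluster unlabeled base-set embeddings with a simple k-means and
    estimate one covariance matrix per cluster (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Initialize centers from random base embeddings (assumption, not the paper's init).
    centers = base_feats[rng.choice(len(base_feats), n_clusters, replace=False)]
    for _ in range(n_iters):
        # Assign each embedding to its nearest center.
        dists = ((base_feats[:, None] - centers[None]) ** 2).sum(-1)
        assign = dists.argmin(1)
        for k in range(n_clusters):
            pts = base_feats[assign == k]
            if len(pts):
                centers[k] = pts.mean(0)
    # Per-cluster covariance captures the "semantic variation" to transfer.
    dim = base_feats.shape[1]
    covs = [np.cov(base_feats[assign == k].T) if (assign == k).sum() > 1
            else np.eye(dim) for k in range(n_clusters)]
    return centers, covs

def augment(z, centers, covs, n_aug=5, rng=None):
    """Generate extra pseudo-features for one support embedding z by adding
    noise drawn from the covariance of its nearest base cluster."""
    rng = rng if rng is not None else np.random.default_rng(0)
    k = ((centers - z) ** 2).sum(1).argmin()
    noise = rng.multivariate_normal(np.zeros_like(z), covs[k], size=n_aug)
    return z + noise  # shape: (n_aug, dim)
```

The augmented features can then be fed, together with the original support embedding, to a simple classifier (e.g. logistic regression) in the few-shot episode; no labels from the base set are needed at any point.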