Learning representations for individual instances when only bag-level labels are available is a fundamental challenge in multiple instance learning (MIL). Recent works have shown promising results using contrastive self-supervised learning (CSSL), which learns to push apart the representations of two different randomly selected instances. Unfortunately, real-world applications such as medical image classification often exhibit class imbalance, so randomly selected instances mostly belong to the same majority class, which precludes CSSL from learning inter-class differences. To address this issue, we propose a novel framework, Iterative Self-paced Supervised Contrastive Learning for MIL Representations (ItS2CLR), which improves the learned representation by exploiting instance-level pseudo labels derived from the bag-level labels. The framework employs a novel self-paced sampling strategy to ensure the accuracy of the pseudo labels. We evaluate ItS2CLR on three medical datasets, showing that it improves the quality of instance-level pseudo labels and representations, and that it outperforms existing MIL methods in terms of both bag-level and instance-level accuracy.
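To make the idea concrete, the following is a minimal sketch, not the authors' implementation, of a supervised contrastive loss over instance embeddings that uses bag-derived pseudo labels and a self-paced curriculum: only the most confident instances are kept in each iteration, and the kept fraction can be increased over training. The function name `self_paced_supcon_loss`, the `keep_ratio` schedule, and the confidence scores are illustrative assumptions, not part of the published method's code.

```python
# Minimal sketch (assumed, not the official ItS2CLR code): supervised
# contrastive loss with pseudo labels and self-paced instance selection.
import torch
import torch.nn.functional as F


def self_paced_supcon_loss(embeddings, pseudo_labels, confidences,
                           keep_ratio, temperature=0.07):
    """Supervised contrastive loss restricted to high-confidence instances.

    embeddings:    (N, D) instance features from the encoder.
    pseudo_labels: (N,) instance-level pseudo labels derived from bag labels.
    confidences:   (N,) confidence scores for the pseudo labels (assumed given).
    keep_ratio:    fraction of most-confident instances used this iteration;
                   raising it over training implements the self-paced schedule.
    """
    n = embeddings.size(0)
    k = max(2, int(keep_ratio * n))
    keep = torch.topk(confidences, k).indices            # self-paced selection

    z = F.normalize(embeddings[keep], dim=1)              # unit-norm features
    labels = pseudo_labels[keep]

    logits = z @ z.t() / temperature
    logits = logits - logits.max(dim=1, keepdim=True).values.detach()  # stability

    self_mask = torch.eye(k, dtype=torch.bool, device=z.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask

    # Denominator sums over all other selected instances (self excluded).
    exp_logits = torch.exp(logits).masked_fill(self_mask, 0.0)
    log_prob = logits - torch.log(exp_logits.sum(dim=1, keepdim=True))

    # Average log-probability over same-pseudo-label positives for each anchor
    # that has at least one positive.
    pos_counts = pos_mask.sum(dim=1)
    valid = pos_counts > 0
    loss = -(log_prob * pos_mask).sum(dim=1)[valid] / pos_counts[valid]
    return loss.mean()


if __name__ == "__main__":
    torch.manual_seed(0)
    feats = torch.randn(32, 128)                          # toy instance features
    labels = torch.randint(0, 2, (32,))                   # toy pseudo labels
    conf = torch.rand(32)                                 # toy confidences
    print(self_paced_supcon_loss(feats, labels, conf, keep_ratio=0.5))
```

In this sketch, pulling together instances that share a pseudo label (rather than only augmented views of the same instance) is what allows the encoder to learn inter-class differences under class imbalance, which is the gap in plain CSSL that the abstract describes.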