In multiple instance learning (MIL), weak labels are provided at the bag level, with only presence/absence information known. However, there is a considerable performance gap relative to a fully supervised model, which limits the practical applicability of MIL approaches. This paper therefore introduces a novel semi-weak label learning paradigm as a middle ground to mitigate the problem. We define semi-weak label data as data for which we know the presence or absence of a given class and the exact count of each class, as opposed to knowing the label proportions. We then propose a two-stage framework to address the problem of learning from semi-weak labels. It leverages the fact that counting information is non-negative and discrete. Experiments are conducted on samples generated from CIFAR-10. We compare our model with a fully supervised baseline, a weakly supervised baseline, and a learning from label proportions (LLP) baseline. Our framework not only outperforms both the MIL-based weakly supervised baseline and the LLP baseline, but also gives results comparable to the fully supervised model. Further, we conduct thorough ablation studies to analyze performance across datasets and its variation with batch size, losses, architectural changes, bag size, and regularization.
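To make the distinction between the three kinds of bag-level supervision concrete, the following is a minimal sketch, not the paper's implementation. The function name `bag_labels`, the example bag, and the use of integer class indices are assumptions made for illustration. It derives MIL-style presence flags, semi-weak per-class counts, and LLP-style proportions from the instance labels of one bag, which would normally be hidden from the learner.

```python
from collections import Counter

NUM_CLASSES = 10  # CIFAR-10 has ten classes


def bag_labels(instance_labels, num_classes=NUM_CLASSES):
    """Derive three bag-level label types from the instance labels of one bag.

    Hypothetical helper for illustration only. Returns a tuple
    (presence, counts, proportions), each a list of length num_classes.
    """
    if not instance_labels:
        raise ValueError("a bag must contain at least one instance")

    per_class = Counter(instance_labels)

    # Semi-weak label: exact, non-negative integer count of each class.
    counts = [per_class.get(c, 0) for c in range(num_classes)]

    # MIL weak label: only whether each class occurs in the bag.
    presence = [int(n > 0) for n in counts]

    # LLP label: fraction of the bag belonging to each class.
    bag_size = len(instance_labels)
    proportions = [n / bag_size for n in counts]

    return presence, counts, proportions


# Example bag of six instances with (hidden) class indices.
presence, counts, proportions = bag_labels([3, 3, 5, 0, 3, 5])
print(presence)  # [1, 0, 0, 1, 0, 1, 0, 0, 0, 0]
print(counts)    # [1, 0, 0, 3, 0, 2, 0, 0, 0, 0]
```

In this sketch the counts are integers that sum to the bag size, which is the non-negative, discrete structure the abstract says the framework leverages; the presence vector discards how many instances of each class occur.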