In multiple-instance multiple-label (MIML) learning, each sample, called a bag, consists of multiple instances. To alleviate labeling complexity, each sample is associated with a set of bag-level labels, leaving the instances within the bag unlabeled. This setting is more convenient and natural for representing complicated objects that have multiple semantic meanings. Compared to single-instance labeling, this approach allows larger datasets to be labeled at an equivalent labeling cost. However, for sufficiently large datasets, labeling all bags may become prohibitively costly. Active learning uses an iterative labeling and retraining approach that aims to provide reasonable classification performance from a small number of labeled samples. To our knowledge, only a few works address active learning in the MIML setting. These approaches can provide practical solutions to reduce labeling cost, but their efficacy remains unclear. In this paper, we propose a novel bag-class pair based approach for active learning in the MIML setting. Because bag-level labels are only partially available, the proposed active learning approach targets the incomplete-label MIML setting. Our approach is based on a discriminative graphical model with efficient and exact inference. For the query process, we adapt active learning criteria to the novel bag-class pair selection strategy. Additionally, we introduce an online stochastic gradient descent algorithm to provide an efficient model update after each query. Numerical experiments on benchmark datasets illustrate the robustness of the proposed approach.
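The query-and-update loop described above can be illustrated with a minimal sketch. It is not the discriminative graphical model of the paper: as a stand-in, it assumes mean-pooled bag features, one logistic regressor per class, and binary entropy as the uncertainty criterion. All names (`bag_features`, `select_pair`, `sgd_update`, `active_learning`, `oracle`) are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bag_features(bag):
    """Mean-pool instance vectors into one bag-level vector (a simplifying assumption)."""
    dim = len(bag[0])
    return [sum(inst[d] for inst in bag) / len(bag) for d in range(dim)]


def predict(weights, x, c):
    """Probability that class c is present in the bag with features x."""
    return sigmoid(sum(w * xi for w, xi in zip(weights[c], x)))


def binary_entropy(p):
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def select_pair(weights, feats, labeled):
    """Return the unlabeled (bag, class) pair whose prediction is most uncertain."""
    candidates = [
        (b, c)
        for b in range(len(feats))
        for c in range(len(weights))
        if (b, c) not in labeled
    ]
    return max(
        candidates,
        key=lambda bc: binary_entropy(predict(weights, feats[bc[0]], bc[1])),
    )


def sgd_update(weights, x, c, y, lr=0.5):
    """One online logistic-regression step for class c on a single queried pair."""
    err = predict(weights, x, c) - y
    weights[c] = [w - lr * err * xi for w, xi in zip(weights[c], x)]


def active_learning(bags, oracle, n_classes, n_queries):
    """Query one bag-class pair per round and update the model after each answer."""
    feats = [bag_features(b) for b in bags]
    weights = [[0.0] * len(feats[0]) for _ in range(n_classes)]
    labeled = {}  # incomplete labels: only queried (bag, class) pairs are known
    for _ in range(n_queries):
        b, c = select_pair(weights, feats, labeled)
        y = oracle(b, c)  # 1 if class c is present in bag b, else 0
        labeled[(b, c)] = y
        sgd_update(weights, feats[b], c, y)
    return weights, labeled
```

Querying a single bag-class pair rather than a full label vector keeps each annotation request small, and the single-sample gradient step avoids retraining from scratch after every query. In this sketch, `n_queries` must not exceed the number of bag-class pairs.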