We study the problem of learning from aggregate observations, where supervision signals are given to sets of instances instead of individual instances, while the goal is still to predict the labels of unseen individuals. A well-known example is multiple instance learning (MIL). In this paper, we extend MIL beyond binary classification to other problems such as multiclass classification and regression. We present a general probabilistic framework that accommodates a variety of aggregate observations, e.g., pairwise similarity and triplet comparison for classification, and mean, difference, and rank observations for regression. Simple maximum likelihood solutions can be applied to various differentiable models such as deep neural networks and gradient boosting machines. Moreover, we develop the concept of consistency up to an equivalence relation to characterize our estimator and show that it has good convergence properties under mild assumptions. Experiments on three problem settings (classification via triplet comparison, regression via mean observation, and regression via rank observation) indicate the effectiveness of the proposed method.
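As an illustration of the maximum likelihood idea for regression via mean observation, the following is a minimal sketch and not the authors' implementation. It assumes a linear model, Gaussian label noise with a known constant variance, and fixed-size bags; the names `make_bags` and `fit_mean_observation` are hypothetical. Under these assumptions the observed bag mean is Gaussian around the mean of the individual predictions, so maximizing the likelihood reduces to least squares on bag means.

```python
import numpy as np

def make_bags(rng, n_bags=500, bag_size=4, noise_std=0.1):
    """Simulate bags whose individual labels Z are hidden; only the bag mean Y is observed."""
    w_true = np.array([1.5, -2.0, 0.5])   # assumed ground-truth weights (illustrative)
    b_true = 0.3                          # assumed ground-truth bias (illustrative)
    X = rng.normal(size=(n_bags, bag_size, w_true.size))
    Z = X @ w_true + b_true + noise_std * rng.normal(size=(n_bags, bag_size))
    Y = Z.mean(axis=1)                    # aggregate observation: one number per bag
    return X, Y, w_true, b_true

def fit_mean_observation(X, Y, lr=0.1, n_steps=2000):
    """Fit f(x) = x @ w + b by gradient descent on the squared error of bag means.

    With Gaussian noise of fixed variance, this is equivalent to minimizing
    the negative log-likelihood of the observed bag means.
    """
    n_bags, bag_size, dim = X.shape
    w = np.zeros(dim)
    b = 0.0
    X_bar = X.mean(axis=1)                # mean feature vector of each bag
    for _ in range(n_steps):
        pred = (X @ w + b).mean(axis=1)   # mean of individual predictions per bag
        resid = pred - Y
        grad_w = 2.0 * (resid[:, None] * X_bar).mean(axis=0)
        grad_b = 2.0 * resid.mean()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

After fitting, `x @ w + b` gives a prediction for a single unseen instance, even though no individual label was ever observed during training.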