We consider the problem of Learning from Label Proportions (LLP), a weakly supervised classification setup in which instances are grouped into "bags" and only the frequency of class labels in each bag is available. Nevertheless, the objective of the learner is to achieve low task loss at the individual instance level. Here we propose EasyLLP, a flexible and simple-to-implement debiasing approach based on aggregate labels that operates with arbitrary loss functions. Our technique allows us to accurately estimate the expected loss of an arbitrary model at the individual instance level. We showcase the flexibility of our approach by applying it to popular learning frameworks such as Empirical Risk Minimization (ERM) and Stochastic Gradient Descent (SGD), with provable guarantees on instance-level performance. More concretely, we exhibit a variance reduction technique that causes the quality of LLP learning to deteriorate only by a factor of k (k being the bag size) in both the ERM and SGD setups, as compared to full supervision. Finally, we validate our theoretical results on multiple datasets, demonstrating that our algorithm performs as well as or better than previous LLP approaches despite its simplicity.
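To make the debiasing idea above concrete, the following Python sketch shows one way a per-instance loss estimate could be built from a bag's label proportion and a class prior in the binary case. The function name, arguments, and the specific combination of bag proportion and prior are illustrative assumptions based on the description in this abstract, not a verbatim reproduction of the paper's estimator.

```python
import numpy as np

def debiased_instance_loss(loss_pos, loss_neg, bag_proportion, bag_size, class_prior):
    """Illustrative sketch of a debiased per-instance loss estimate for binary LLP.

    loss_pos / loss_neg: loss of the model on this instance under label 1 / label 0.
    bag_proportion: observed fraction of positive labels in the instance's bag.
    bag_size: number of instances per bag (k).
    class_prior: assumed marginal probability of the positive class (p).

    The bag-level term is rescaled by k and corrected with a prior-based term so
    that, in expectation over the random composition of the bag, the estimate
    matches the fully supervised loss of the instance.
    """
    k, alpha, p = bag_size, bag_proportion, class_prior
    return (k * (alpha * loss_pos + (1.0 - alpha) * loss_neg)
            - (k - 1) * (p * loss_pos + (1.0 - p) * loss_neg))

# Example usage (hypothetical numbers): an instance whose model losses are 0.2 if
# its label were positive and 1.5 if negative, in a bag of size 8 with 3 positives,
# under an assumed class prior of 0.4.
estimate = debiased_instance_loss(0.2, 1.5, bag_proportion=3 / 8, bag_size=8, class_prior=0.4)
print(np.round(estimate, 4))
```

Because the estimate is a simple affine combination of the per-label losses, it can be plugged into ERM objectives or used as an SGD gradient signal without changing the underlying loss function, which is the flexibility the abstract refers to.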