Learning from label proportions (LLP) aims to learn an instance-level classifier from grouped training data in which only the label proportions of each group (bag) are given. Existing deep-learning-based LLP methods use end-to-end pipelines trained with a proportional loss, the Kullback-Leibler divergence between the bag-level prior and posterior class distributions. However, unconstrained optimization of this objective rarely reaches a solution consistent with the given proportions. Moreover, for a probabilistic classifier, this strategy unavoidably yields high-entropy conditional class distributions at the instance level. Both issues degrade instance-level classification performance. In this paper, we treat these problems as noisy pseudo-labeling and instead impose strict proportion consistency on the classifier through constrained optimization, applied as a continued training stage for existing LLP classifiers. In addition, we introduce the mixup strategy and symmetric cross-entropy to further reduce label noise. Our framework is model-agnostic and, when incorporated into other deep LLP models as a post-hoc phase, demonstrates compelling performance improvements in extensive experiments.
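To make the bag-level objective concrete, the following is a minimal NumPy sketch of a KL-based proportion loss for a single bag. It assumes the bag-level posterior is the mean of the instance-level softmax outputs, which is a common choice but is not spelled out in the abstract. The names `softmax`, `proportion_loss`, and `eps` are illustrative and not taken from the paper.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def proportion_loss(logits, bag_proportions, eps=1e-8):
    """KL(prior || posterior) for one bag.

    logits: (n_instances, n_classes) classifier outputs for the bag.
    bag_proportions: (n_classes,) given label proportions (the prior).
    Assumption: the bag-level posterior is the mean instance posterior.
    """
    posterior = softmax(logits).mean(axis=0)
    prior = np.asarray(bag_proportions, dtype=float)
    return float(np.sum(prior * (np.log(prior + eps) - np.log(posterior + eps))))

prior = [0.5, 0.5]

# Every instance is maximally uncertain (uniform posterior).
uncertain = np.zeros((4, 2))

# Every instance is confident, and the bag still matches the proportions.
confident = np.array([[10.0, -10.0], [10.0, -10.0],
                      [-10.0, 10.0], [-10.0, 10.0]])

# Both losses are close to zero: the bag-level objective alone cannot
# tell the high-entropy solution from the confident one.
loss_uncertain = proportion_loss(uncertain, prior)
loss_confident = proportion_loss(confident, prior)
```

In this toy bag the uniform predictions and the confident predictions both drive the loss to roughly zero, which illustrates why minimizing the bag-level loss on its own does not rule out high-entropy instance-level predictions.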