This paper discusses the problem of weakly supervised classification, in which instances are given weak labels that are produced by some label-corruption process. The goal is to derive conditions under which loss functions for weak-label learning are proper and lower-bounded -- two essential requirements for the losses used in class-probability estimation. To this end, we derive a representation theorem for proper losses in supervised learning, which dualizes the Savage representation. We use this theorem to characterize proper weak-label losses and find a condition for them to be lower-bounded. From these theoretical findings, we derive a novel regularization scheme called generalized logit squeezing, which makes any proper weak-label loss bounded from below, without losing properness. Furthermore, we experimentally demonstrate the effectiveness of our proposed approach, as compared to improper or unbounded losses. The results highlight the importance of properness and lower-boundedness.
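The lower-boundedness issue described above can be illustrated numerically. The sketch below is not the paper's method: it uses a backward-corrected cross-entropy (coefficients from the inverse of an assumed noise-transition matrix `T`) as an example of a weak-label loss that is unbounded below, and plain logit squeezing (a `lam * ||z||^2` penalty) as a simplified stand-in for the paper's generalized scheme; the flip rate and `lam` are arbitrary illustrative choices.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def backward_corrected_ce(z, weak_label, T_inv):
    """Backward-corrected cross-entropy for a weak label.

    The correction weights come from a row of T^{-1}, which can contain
    negative entries -- the source of unboundedness from below.
    """
    p = softmax(z)
    per_class = -np.log(np.clip(p, 1e-300, None))
    return T_inv[weak_label] @ per_class

def squeezed_loss(z, weak_label, T_inv, lam=0.1):
    """Corrected loss plus a logit-squeezing penalty lam * ||z||^2.

    The penalty grows quadratically in the logit scale, while the
    negative part of the corrected loss grows only linearly, so the
    sum is bounded from below. (Simplified stand-in, not the paper's
    generalized scheme.)
    """
    return backward_corrected_ce(z, weak_label, T_inv) + lam * np.dot(z, z)

# Assumed corruption process: symmetric label noise, flip rate 0.4, 2 classes.
rho = 0.4
T = np.array([[1 - rho, rho], [rho, 1 - rho]])
T_inv = np.linalg.inv(T)  # = [[3, -2], [-2, 3]]: negative off-diagonals

# As the logits grow, the corrected loss diverges toward -infinity,
# while the squeezed loss stays bounded below.
for s in [1.0, 10.0, 100.0]:
    z = np.array([s, -s])
    print(f"scale {s:6.1f}: corrected {backward_corrected_ce(z, 0, T_inv):10.3f}"
          f"  squeezed {squeezed_loss(z, 0, T_inv):10.3f}")
```

For `z = [s, -s]` with weak label 0, the corrected loss behaves like `-4s` for large `s` (unbounded below), while the quadratic penalty `0.1 * 2 * s**2` dominates it, so the squeezed objective cannot be driven to minus infinity by inflating the logits.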