This paper discusses the problem of weakly supervised classification, in which instances are given weak labels produced by some label-corruption process. The goal is to derive conditions under which loss functions for weak-label learning are proper and lower-bounded -- two essential requirements for losses used in class-probability estimation. To this end, we derive a representation theorem for proper losses in supervised learning that dualizes the Savage representation. We use this theorem to characterize proper weak-label losses and find a condition under which they are lower-bounded. Based on these theoretical findings, we derive a novel regularization scheme called generalized logit squeezing, which makes any proper weak-label loss bounded from below without losing properness. Furthermore, we experimentally demonstrate the effectiveness of the proposed approach compared to improper or unbounded losses. These results highlight the importance of properness and lower-boundedness. The code is publicly available at https://github.com/yoshum/lower-bounded-proper-losses.
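To illustrate the style of regularization the abstract refers to, below is a minimal PyTorch sketch of a logit-squeezing-type penalty added to an arbitrary weak-label loss, in the spirit of standard logit squeezing (a penalty on the logit norm). The function name `squeezed_weak_label_loss` and the parameters `lam` and `p` are hypothetical; the generalized form proposed in the paper may differ.

```python
# Sketch only: a logit-norm penalty added to a (possibly unbounded) weak-label loss.
# The exact "generalized logit squeezing" form in the paper may differ.
import torch
import torch.nn.functional as F


def squeezed_weak_label_loss(logits, weak_targets, base_loss, lam=0.1, p=2.0):
    """Add a logit-magnitude penalty to a weak-label loss.

    logits:       (batch, num_classes) raw model outputs
    weak_targets: targets in whatever form `base_loss` expects
    base_loss:    callable mapping (logits, weak_targets) -> scalar loss
    lam, p:       regularization strength and norm exponent (hypothetical names)
    """
    # Penalizing the logit norm discourages the logits from diverging,
    # which keeps the combined objective bounded from below.
    penalty = logits.norm(p=p, dim=1).pow(p).mean()
    return base_loss(logits, weak_targets) + lam * penalty


# Usage example with ordinary cross-entropy standing in for the weak-label loss:
if __name__ == "__main__":
    logits = torch.randn(8, 5, requires_grad=True)
    targets = torch.randint(0, 5, (8,))
    loss = squeezed_weak_label_loss(logits, targets, F.cross_entropy)
    loss.backward()
    print(float(loss))
```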