Training a classifier with noisy labels typically requires the learner to specify the distribution of the label noise, which is often unknown in practice. Although there have been recent attempts to relax that requirement, we show that the Bayes decision rule is not identified in most classification problems with noisy labels. This suggests that it is generally not possible to bypass or relax the requirement. In the special cases in which the Bayes decision rule is identified, we develop a simple algorithm that learns it without requiring knowledge of the noise distribution.
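To make the non-identification claim concrete, here is a minimal sketch for a single feature value under binary, class-conditional label noise. The noise model, the function names, and the numbers are illustrative assumptions, not taken from the paper. It shows two hypothetical "worlds" that produce the same distribution of noisy labels yet have different Bayes decisions, so noisy data alone cannot tell them apart.

```python
import math


def noisy_posterior(eta, rho0, rho1):
    """P(noisy label = 1 | x) under an assumed class-conditional noise model.

    eta  : clean posterior P(y = 1 | x)
    rho0 : flip rate P(noisy = 1 | y = 0)
    rho1 : flip rate P(noisy = 0 | y = 1)
    """
    return (1.0 - rho1) * eta + rho0 * (1.0 - eta)


def bayes_decision(eta):
    """Bayes decision for 0-1 loss: predict 1 when the clean posterior exceeds 1/2."""
    return int(eta > 0.5)


# Hypothetical world A: labels are clean and the clean posterior is 0.4.
world_a = {"eta": 0.4, "rho0": 0.0, "rho1": 0.0}
# Hypothetical world B: 40% of positives are flipped and the clean posterior is 2/3.
world_b = {"eta": 2.0 / 3.0, "rho0": 0.0, "rho1": 0.4}

observed_a = noisy_posterior(**world_a)
observed_b = noisy_posterior(**world_b)

# Both worlds induce the same noisy-label distribution at this x ...
same_observations = math.isclose(observed_a, observed_b)
# ... but their Bayes decisions on the clean labels disagree.
decision_a = bayes_decision(world_a["eta"])
decision_b = bayes_decision(world_b["eta"])

print(same_observations, decision_a, decision_b)
```

Unless the learner supplies the flip rates (or some other restriction that rules out one of the two worlds), no amount of noisy data at this feature value distinguishes the two cases.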