Labelling data for supervised learning can be costly and time-consuming, and the risk of incorporating label noise in large data sets is ever-present. When training a flexible discriminative model using a strictly proper loss, such noise will inevitably shift the solution towards the conditional distribution over noisy labels. Nevertheless, while deep neural networks have proven capable of fitting random labels, regularisation and the use of robust loss functions empirically mitigate the effects of label noise. However, such observations concern robustness in terms of accuracy, which is insufficient if reliable uncertainty quantification is critical. We demonstrate this by analysing the properties of the conditional distribution over noisy labels for an input-dependent noise model. In addition, we evaluate the set of robust loss functions characterised by noise-insensitive, asymptotic risk minimisers. We find that strictly proper and robust loss functions both offer asymptotic robustness in accuracy, but neither guarantees that the final model is calibrated. Moreover, even with robust loss functions, overfitting is an issue in practice. With these results, we aim to explain the observed robustness of common training practices, such as early stopping, to label noise. In addition, we aim to encourage the development of new noise-robust algorithms that not only preserve accuracy but also ensure reliability.
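To make the asymptotic claim concrete, the following is a minimal sketch of the standard noisy-label decomposition; the transition-model notation $p(\tilde{y} \mid y, x)$ is our illustrative assumption, not necessarily the parametrisation used in the paper. Writing $\tilde{y}$ for the observed (noisy) label and $y$ for the clean label, the conditional distribution over noisy labels is
\[
\tilde{p}(\tilde{y} \mid x) \;=\; \sum_{y} p(\tilde{y} \mid y, x)\, p(y \mid x).
\]
A flexible model trained with a strictly proper loss on noisy data converges, asymptotically, to $\tilde{p}(\tilde{y} \mid x)$ rather than the clean posterior $p(y \mid x)$. Whenever the noise preserves the mode, i.e. $\arg\max_{y} \tilde{p}(y \mid x) = \arg\max_{y} p(y \mid x)$, accuracy is unaffected, yet the reported confidences are calibrated only with respect to the noisy labels; this is one way to read the abstract's distinction between robustness in accuracy and calibration.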