Labelling of data for supervised learning can be costly and time-consuming, and the risk of incorporating label noise in large data sets is imminent. When a flexible discriminative model is trained using a strictly proper loss, such noise will inevitably shift the solution towards the conditional distribution over noisy labels. Nevertheless, while deep neural networks have proved capable of fitting random labels, regularisation and the use of robust loss functions empirically mitigate the effects of label noise. However, such observations concern robustness in accuracy, which is insufficient if reliable uncertainty quantification is critical. We demonstrate this by analysing the properties of the conditional distribution over noisy labels for an input-dependent noise model. In addition, we evaluate the set of robust loss functions characterised by an overlap in asymptotic risk minimisers under the clean and noisy data distributions. We find that strictly proper and robust loss functions both offer asymptotic robustness in accuracy, but neither guarantees that the resulting model is calibrated. Moreover, overfitting is an issue in practice. With these results, we aim to explain the inherent robustness of algorithms to label noise and to give guidance in the development of new noise-robust algorithms.
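As a minimal sketch of the setup, under assumed notation (the abstract fixes none): let $y \in \{1, \dots, K\}$ denote the clean label, $\tilde{y}$ the observed noisy label, and $p(\tilde{y} = j \mid y = k, x)$ an input-dependent noise transition. The conditional distribution over noisy labels is then the clean conditional pushed through this transition:

\[
  \tilde{p}(\tilde{y} = j \mid x) \;=\; \sum_{k=1}^{K} p(\tilde{y} = j \mid y = k,\, x)\, p(y = k \mid x), \qquad j = 1, \dots, K.
\]

A strictly proper loss fitted on noisy data asymptotically recovers $\tilde{p}(\tilde{y} \mid x)$ rather than $p(y \mid x)$; the two may share the same arg max (preserving accuracy) while assigning different probabilities, which is why robustness in accuracy does not imply calibration with respect to the clean labels.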