Labelling data for supervised learning can be costly and time-consuming, and the risk of incorporating label noise in large data sets is ever-present. When training a flexible discriminative model using a strictly proper loss, such noise will inevitably shift the solution towards the conditional distribution over noisy labels. Nevertheless, while deep neural networks have proven capable of fitting random labels, regularisation and the use of robust loss functions empirically mitigate the effects of label noise. However, such observations concern robustness in terms of accuracy, which is insufficient if reliable uncertainty quantification is critical. We demonstrate this by analysing the properties of the conditional distribution over noisy labels for an input-dependent noise model. In addition, we evaluate the set of robust loss functions characterised by noise-insensitive, asymptotic risk minimisers. We find that strictly proper and robust loss functions both offer asymptotic robustness in accuracy, but neither guarantees that the final model is calibrated. Moreover, even with robust loss functions, overfitting is an issue in practice. With these results, we aim to explain the observed robustness of common training practices, such as early stopping, to label noise. In addition, we aim to encourage the development of new noise-robust algorithms that not only preserve accuracy but also ensure reliability.
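To make the asymptotic claim concrete, the following is a minimal sketch of the standard noisy-label decomposition; the transition-model notation $p(\tilde{y} \mid y, x)$ is our illustrative assumption, not necessarily the parametrisation used in the paper. Writing $\tilde{y}$ for the observed (noisy) label and $y$ for the clean label, the conditional distribution over noisy labels is
\[
\tilde{p}(\tilde{y} \mid x) \;=\; \sum_{y} p(\tilde{y} \mid y, x)\, p(y \mid x).
\]
A flexible model trained with a strictly proper loss on noisy data converges, asymptotically, to $\tilde{p}(\tilde{y} \mid x)$ rather than the clean posterior $p(y \mid x)$. Whenever the noise preserves the mode, i.e. $\arg\max_{y} \tilde{p}(y \mid x) = \arg\max_{y} p(y \mid x)$, accuracy is unaffected, yet the reported confidences are calibrated only with respect to the noisy labels; this is one way to read the abstract's distinction between robustness in accuracy and calibration.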