Detecting Label Noise via Leave-One-Out Cross Validation - Zhuanzhi Papers (专知论文)


Leave-one-out · Cross-validation · Maximum likelihood estimation · Noise · Gaussian process regression

March 21, 2021


Yu-Hang Tang, Yuanran Zhu, Wibe A. de Jong

We present a simple algorithm for identifying and correcting real-valued noisy labels from a mixture of clean and corrupted samples using Gaussian process regression. A heteroscedastic noise model is employed, in which additive Gaussian noise terms with independent variances are associated with each and all of the observed labels. Thus, the method effectively applies a sample-specific Tikhonov regularization term, generalizing the uniform regularization prevalent in standard Gaussian process regression. Optimizing the noise model using maximum likelihood estimation leads to the containment of the GPR model's predictive error by the posterior standard deviation in leave-one-out cross-validation. A multiplicative update scheme is proposed for solving the maximum likelihood estimation problem under non-negative constraints. While we provide a proof of monotonic convergence for certain special cases, the multiplicative scheme has empirically demonstrated monotonic convergence behavior in virtually all our numerical experiments. We show that the presented method can pinpoint corrupted samples and lead to better regression models when trained on synthetic and real-world scientific data sets.
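
The abstract's main ingredients can be sketched in a few lines of NumPy: a Gaussian process with one noise variance per label, closed-form leave-one-out (LOO) residuals, and a multiplicative update that keeps every variance non-negative. This is a minimal illustration under stated assumptions, not the authors' implementation. The RBF kernel, the toy sine data, and the names `rbf_kernel`, `loo_residuals`, and `fit_noise` are hypothetical. The update rule (multiply each variance by the ratio of the positive to the negative part of the log-likelihood gradient) is a generic choice for non-negative problems and may differ from the scheme in the paper. Its fixed points with a non-zero variance satisfy "squared LOO residual equals LOO predictive variance", which is consistent with the containment property described above.

```python
import numpy as np

def rbf_kernel(x, length_scale=0.2, jitter=1e-4):
    """Squared-exponential kernel matrix with a small diagonal jitter (assumed)."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2) + jitter * np.eye(len(x))

def loo_residuals(K, s, y):
    """Closed-form LOO residuals and predictive variances for K_y = K + diag(s)."""
    K_inv = np.linalg.inv(K + np.diag(s))
    alpha = K_inv @ y
    diag = np.diag(K_inv)
    return alpha / diag, 1.0 / diag

def fit_noise(K, y, n_iter=50, s0=1.0):
    """Multiplicative update of the per-sample noise variances s >= 0.

    The log-likelihood gradient w.r.t. s_i is 0.5 * (alpha_i**2 - K_inv[i, i]);
    scaling s_i by alpha_i**2 / K_inv[i, i] keeps it non-negative.
    This rule is an illustrative assumption, not taken from the paper.
    """
    s = np.full(len(y), s0)
    for _ in range(n_iter):
        K_inv = np.linalg.inv(K + np.diag(s))
        alpha = K_inv @ y
        s = s * alpha**2 / np.diag(K_inv)
    return s

# Toy data: a clean sine curve with three corrupted labels (hypothetical setup).
x = np.linspace(0.0, 1.0, 40)
y_clean = np.sin(2.0 * np.pi * x)
corrupted = np.array([8, 20, 31])
y = y_clean.copy()
y[corrupted] += np.array([3.0, -3.0, 3.0])

K = rbf_kernel(x)
s = fit_noise(K, y)

# Samples with the largest learned noise variance are flagged as corrupted;
# the LOO predictive mean serves as a corrected label.
flagged = np.sort(np.argsort(s)[-3:])
residual, variance = loo_residuals(K, s, y)
y_corrected = y - residual
print("flagged indices:", flagged)
```

Here each learned variance plays the role of the sample-specific Tikhonov term mentioned in the abstract: a large `s[i]` down-weights label `i` in the fit, while a value near zero leaves it essentially unregularized beyond the jitter.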

