We investigate problems in penalized $M$-estimation, inspired by applications in machine learning debugging. Data are collected from two pools: one containing data with possibly contaminated labels, and the other known to contain only cleanly labeled points. We first formulate a general statistical algorithm for identifying buggy points and provide rigorous theoretical guarantees under the assumption that the data follow a linear model. We then present two case studies that illustrate the results of our general theory and the dependence of our estimator on clean versus buggy points. We further propose an algorithm for tuning parameter selection in our Lasso-based algorithm and provide corresponding theoretical guarantees. Finally, we consider a two-person "game" played between a bug generator and a debugger, where the debugger can augment the contaminated data set with cleanly labeled versions of points in the original data pool. We establish a theoretical result giving a sufficient condition under which the bug generator can always fool the debugger. Nonetheless, we provide empirical results showing that such a situation may not arise in practice, making it possible for natural augmentation strategies combined with our Lasso debugging algorithm to succeed.
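To make the Lasso-based debugging idea concrete, the following is a minimal sketch of one standard way to instantiate it: model the contaminated pool as $y = X\beta + \gamma + \varepsilon$, where each point carries its own mean-shift parameter $\gamma_i$, and penalize $\gamma$ with an $\ell_1$ norm so that only buggy points receive nonzero shifts. The alternating-minimization scheme, the function names, and the tuning parameter `lam` below are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def soft_threshold(z, lam):
    """Elementwise soft-thresholding, the proximal map of the L1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_debug(X, y, lam, n_iter=100):
    """Alternately fit beta by least squares on the shift-corrected labels,
    then update the per-point shifts gamma by soft-thresholding the residuals.
    Nonzero entries of gamma flag suspect (buggy) labels."""
    n, p = X.shape
    gamma = np.zeros(n)
    for _ in range(n_iter):
        beta, *_ = np.linalg.lstsq(X, y - gamma, rcond=None)
        gamma = soft_threshold(y - X @ beta, lam)
    return beta, gamma

# Demo on synthetic linear-model data: corrupt a few labels, then flag them.
rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
beta_true = rng.normal(size=p)
y = X @ beta_true + 0.1 * rng.normal(size=n)
buggy = [3, 17, 42]
y[buggy] += 5.0  # contaminate three labels with a large shift
beta_hat, gamma_hat = lasso_debug(X, y, lam=1.0)
flagged = np.flatnonzero(np.abs(gamma_hat) > 1e-8)
```

In this sketch a larger `lam` flags fewer points; choosing it in a data-driven way is exactly the tuning-parameter-selection problem the abstract refers to.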