Distribution-Free Robust Linear Regression - Zhuanzhi (专知) Papers


Linear regression · Estimation/estimators · Linear · Optimizers · Robustness

October 21, 2021

Distribution-Free Robust Linear Regression


Jaouad Mourtada,Tomas Vaškevičius,Nikita Zhivotovskiy

From arXiv, 29 pages, to appear in Mathematical Statistics and Learning

We study random design linear regression with no assumptions on the distribution of the covariates and with a heavy-tailed response variable. In this distribution-free regression setting, we show that boundedness of the conditional second moment of the response given the covariates is a necessary and sufficient condition for achieving nontrivial guarantees. As a starting point, we prove an optimal version of the classical in-expectation bound for the truncated least squares estimator due to Györfi, Kohler, Krzyżak, and Walk. However, we show that this procedure fails with constant probability for some distributions despite its optimal in-expectation performance. Then, combining the ideas of truncated least squares, median-of-means procedures, and aggregation theory, we construct a non-linear estimator achieving excess risk of order $d/n$ with an optimal sub-exponential tail. While existing approaches to linear regression for heavy-tailed distributions focus on proper estimators that return linear functions, we highlight that the improperness of our procedure is necessary for attaining nontrivial guarantees in the distribution-free setting.
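Two of the building blocks named in the abstract can be illustrated in a few lines. The sketch below is only an illustration under stated assumptions, not the estimator constructed in the paper: it shows a one-dimensional least squares fit whose predictions are clipped to `[-level, level]` (the usual reading of "truncated least squares"), and a basic median-of-means average. The function names, the one-dimensional model without intercept, and the user-supplied `level` and `num_blocks` are all hypothetical choices made for the example; the aggregation step from the paper is not implemented.

```python
import statistics


def fit_least_squares_1d(xs, ys):
    """Least squares slope for the model y ~ w * x (no intercept)."""
    sxx = sum(x * x for x in xs)
    if sxx == 0.0:
        return 0.0  # all covariates are zero; any slope fits, pick 0
    return sum(x * y for x, y in zip(xs, ys)) / sxx


def truncate(value, level):
    """Clip a value to the interval [-level, level]."""
    return max(-level, min(level, value))


def truncated_prediction(w, x, level):
    """Predict with the linear fit, then clip; the result is not linear in x."""
    return truncate(w * x, level)


def median_of_means(values, num_blocks):
    """Average each of num_blocks equal-size blocks, return the median of the averages.

    Trailing values that do not fill a complete block are ignored.
    """
    if not 1 <= num_blocks <= len(values):
        raise ValueError("num_blocks must be between 1 and len(values)")
    size = len(values) // num_blocks
    block_means = [
        sum(values[i * size:(i + 1) * size]) / size for i in range(num_blocks)
    ]
    return statistics.median(block_means)


# Assumed toy data: one extreme value dominates the plain mean
# but only contaminates a single block of the median-of-means.
sample = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 1000.0]
print(statistics.mean(sample))
print(median_of_means(sample, 3))  # 5.0

w_hat = fit_least_squares_1d([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(truncated_prediction(w_hat, 10.0, 5.0))  # 5.0, clipped from 20.0
```

Because the clipped predictor is no longer a linear function of the covariate, it is an example of what the abstract calls an improper procedure.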



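Related content

Linear regression

Linear regression is a widely used statistical method that applies regression analysis to determine the quantitative dependence between two or more variables. It takes the form y = w'x + e, where e is an error term following a normal distribution with mean 0.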
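The model form y = w'x + e can be illustrated with a small simulation. The following is a minimal sketch, assuming two-dimensional covariates, a hypothetical true weight vector `true_w`, and Gaussian noise with standard deviation 0.5; the helper `fit_ols_2d` is an illustrative name and solves the 2x2 normal equations directly rather than calling a library.

```python
import random


def fit_ols_2d(xs, ys):
    """Ordinary least squares for y ~ w[0]*x[0] + w[1]*x[1].

    Solves the normal equations (X'X) w = X'y in closed form.
    """
    a = sum(x[0] * x[0] for x in xs)
    b = sum(x[0] * x[1] for x in xs)
    c = sum(x[1] * x[1] for x in xs)
    p = sum(x[0] * y for x, y in zip(xs, ys))
    q = sum(x[1] * y for x, y in zip(xs, ys))
    det = a * c - b * b
    if det == 0.0:
        raise ValueError("X'X is singular; the covariates are collinear")
    return ((c * p - b * q) / det, (a * q - b * p) / det)


# Simulate y = w'x + e with e drawn from a normal distribution with mean 0.
rng = random.Random(0)   # fixed seed so the run is reproducible
true_w = (2.0, -1.0)     # assumed weights for the example
xs = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(2000)]
ys = [true_w[0] * x[0] + true_w[1] * x[1] + rng.gauss(0.0, 0.5) for x in xs]

w_hat = fit_ols_2d(xs, ys)
print(w_hat)  # should be close to true_w
```

With noise-free data the fit recovers the weights exactly; for instance, the points (1, 0), (0, 1), (1, 1) with responses 2, -1, 1 give back (2.0, -1.0).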


