We present a novel data-driven strategy for choosing the hyperparameter $k$ in the $k$-NN regression estimator. We treat the problem of choosing the hyperparameter as an iterative procedure (over $k$) and propose a strategy, easy to implement in practice, based on the idea of early stopping and the minimum discrepancy principle. This model selection strategy is proven to be minimax-optimal, under the fixed-design assumption on covariates, over some smoothness function classes, for instance, the class of Lipschitz functions on a bounded domain. The novel method often improves statistical performance on artificial and real-world data sets in comparison to other model selection strategies, such as the Hold-out method and 5-fold cross-validation. The novelty of the strategy comes from reducing the computational time of the model selection procedure while preserving the statistical (minimax) optimality of the resulting estimator. More precisely, given a sample of size $n$, and assuming that the nearest neighbors are already precomputed, if one needs to choose $k$ among $\left\{ 1, \ldots, n \right\}$, the strategy reduces the computational time of the generalized cross-validation or Akaike's AIC criteria from $\mathcal{O}\left( n^3 \right)$ to $\mathcal{O}\left( n^2 (n - k) \right)$, where $k$ is the proposed (minimum discrepancy principle) number of nearest neighbors. Code for the simulations is provided at https://github.com/YaroslavAveryanov/Minimum-discrepancy-principle-for-choosing-k.
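For intuition, the stopping rule behind this strategy can be sketched as follows; this is a schematic formulation under the simplifying assumption of a known noise level $\sigma^2$, and the notation $\widehat{f}_k$ for the $k$-NN estimate at the design points $x_1, \ldots, x_n$ is illustrative rather than taken verbatim from the paper. Starting from $k = n$ and decreasing $k$, one stops at
\[
\widehat{k} \; := \; \max\left\{ k \in \left\{ 1, \ldots, n \right\} \; : \; \frac{1}{n} \sum_{i=1}^{n} \bigl( Y_i - \widehat{f}_k(x_i) \bigr)^2 \le \sigma^2 \right\},
\]
that is, at the first $k$ (in decreasing order) for which the empirical risk of the $k$-NN estimator falls below the noise level. Each candidate $k$ costs at most $\mathcal{O}(n^2)$ to evaluate once the neighbors are precomputed, and only $n - k$ candidates are visited, which is the source of the $\mathcal{O}\left( n^2 (n - k) \right)$ complexity stated above.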