Cross-validation: what does it estimate and how well does it do it? - 专知 (Zhuanzhi) Papers

Estimation/estimators · Data splitting · Confidence · Cross-validation · MoDELS

April 1, 2021

Cross-validation: what does it estimate and how well does it do it?

Stephen Bates,Trevor Hastie,Robert Tibshirani

Cross-validation is a widely used technique to estimate prediction error, but its behavior is complex and not fully understood. Ideally, one would like to think that cross-validation estimates the prediction error for the model at hand, fit to the training data. We prove that this is not the case for the linear model fit by ordinary least squares; rather, it estimates the average prediction error of models fit on other unseen training sets drawn from the same population. We further show that this phenomenon occurs for most popular estimates of prediction error, including data splitting, bootstrapping, and Mallows' Cp. Next, the standard confidence intervals for prediction error derived from cross-validation may have coverage far below the desired level. Because each data point is used for both training and testing, there are correlations among the measured accuracies for each fold, and so the usual estimate of variance is too small. We introduce a nested cross-validation scheme to estimate this variance more accurately, and show empirically that this modification leads to intervals with approximately correct coverage in many examples where traditional cross-validation intervals fail. Lastly, our analysis also shows that when producing confidence intervals for prediction accuracy with simple data splitting, one should not re-fit the model on the combined data, since this invalidates the confidence intervals.
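To make the "usual estimate of variance" concrete, here is a minimal sketch of K-fold cross-validation for an ordinary least squares fit, with the naive interval that treats the per-point errors as independent. This is the interval the abstract warns may undercover; it is not the authors' nested cross-validation scheme. The function name `kfold_cv_naive_interval`, the simulated data, and the choices of 10 folds and a 1.96 normal quantile are all assumptions made for illustration.

```python
import numpy as np

def kfold_cv_naive_interval(X, y, n_folds=10, z=1.96, seed=0):
    """K-fold CV estimate of squared-error prediction error for OLS,
    plus the naive interval that ignores correlation between folds."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Randomly partition the row indices into n_folds disjoint test sets.
    folds = np.array_split(rng.permutation(n), n_folds)
    errors = np.empty(n)
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        # Fit OLS on the training rows only.
        beta, *_ = np.linalg.lstsq(X[train_mask], y[train_mask], rcond=None)
        # Record the squared error of each held-out point.
        errors[test_idx] = (y[test_idx] - X[test_idx] @ beta) ** 2
    estimate = errors.mean()
    # Naive standard error: assumes the n errors are independent,
    # which is the assumption the abstract says fails.
    se = errors.std(ddof=1) / np.sqrt(n)
    return estimate, (estimate - z * se, estimate + z * se)

# Simulated linear-model data (hypothetical sizes, chosen for illustration).
data_rng = np.random.default_rng(1)
n, p = 100, 20
X = data_rng.normal(size=(n, p))
y = X @ data_rng.normal(size=p) + data_rng.normal(size=n)

est, (lo, hi) = kfold_cv_naive_interval(X, y)
print(f"CV estimate: {est:.3f}, naive 95% interval: ({lo:.3f}, {hi:.3f})")
```

Because every point serves in training for the other folds, the held-out errors are correlated, so the standard error computed above would tend to be too small according to the abstract's argument.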
