Forbidden Knowledge and Specialized Training: A Versatile Solution for the Two Main Sources of Overfitting in Linear Regression

Knowledge · Estimation/Estimator · Linear regression · Predictor/Decision function · Overfitting

September 3, 2022

Overfitting in linear regression is broken down into two main causes. First, the formula for the estimator includes 'forbidden knowledge' about training observations' residuals, and it loses this advantage when deployed out-of-sample. Second, the estimator has 'specialized training' that makes it particularly capable of explaining movements in the predictors that are idiosyncratic to the training sample. An out-of-sample counterpart is introduced to the popular 'leverage' measure of training observations' importance. A new method is proposed to forecast out-of-sample fit at the time of deployment, when the values for the predictors are known but the true outcome variable is not. In Monte Carlo simulations and in an empirical application using MRI brain scans, the proposed estimator performs comparably to Predicted Residual Error Sum of Squares (PRESS) for the average out-of-sample case and, unlike PRESS, also performs consistently across different test samples, even those that differ substantially from the training set.
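For readers unfamiliar with the two standard quantities the abstract refers to, the following is a minimal sketch of leverage and PRESS for a one-predictor regression with an intercept. It is not the paper's proposed estimator, which the abstract does not spell out. The function names and the toy data are hypothetical, and evaluating the leverage formula at a new point `x0` is only the textbook analogue of an out-of-sample leverage, assumed here for illustration.

```python
# Leverage and PRESS for y = a + b*x, using only the standard library.
# All names and data below are illustrative assumptions, not the paper's method.

def fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - b * x_bar, b

def leverage(xs, x0):
    """x0' (X'X)^{-1} x0 for the design [1, x].

    Equals the usual h_ii when x0 is a training point; for a new x0 it is
    the textbook analogue that scales the prediction-error variance.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return 1.0 / n + (x0 - x_bar) ** 2 / sxx

def rss(xs, ys):
    """In-sample residual sum of squares."""
    a, b = fit(xs, ys)
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

def press_shortcut(xs, ys):
    """PRESS via the identity: leave-one-out residual = e_i / (1 - h_ii)."""
    a, b = fit(xs, ys)
    return sum(((y - (a + b * x)) / (1.0 - leverage(xs, x))) ** 2
               for x, y in zip(xs, ys))

def press_refit(xs, ys):
    """PRESS by explicitly refitting with each observation held out."""
    total = 0.0
    for i in range(len(xs)):
        a, b = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - (a + b * xs[i])) ** 2
    return total

# Toy training sample with one high-leverage point (x = 10).
xs = [1.0, 2.0, 3.0, 4.0, 10.0]
ys = [1.1, 1.9, 3.2, 3.9, 10.5]

print("in-sample RSS:", rss(xs, ys))
print("PRESS (shortcut):", press_shortcut(xs, ys))
print("PRESS (refit):", press_refit(xs, ys))
print("leverage at a far-away test point x0 = 25:", leverage(xs, 25.0))
```

The in-sample residuals are shrunk by the factor `1 - h_ii` relative to the leave-one-out residuals, which is one way to read the abstract's point that the estimator "knows" the training residuals. A test point far from the training predictors has a large value of `leverage(xs, x0)`, so an average-case summary such as PRESS may understate its prediction error.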
