Is interpolation benign for random forests? - Zhuanzhi Papers (专知论文)

Random forests · MoDELS · Generalization theory · Bootstrap · CASE

February 8, 2022

Is interpolation benign for random forests?

Ludovic Arnould, Claire Boyer, Erwan Scornet

Statistical wisdom suggests that very complex models, interpolating training data, will be poor at prediction on unseen examples. Yet this aphorism has recently been challenged by the identification of benign overfitting regimes, studied especially in the case of parametric models: generalization capabilities may be preserved despite high model complexity. While it is widely known that fully-grown decision trees interpolate and, in turn, have poor predictive performance, the same behavior is yet to be analyzed for random forests. In this paper, we study the trade-off between interpolation and consistency for several types of random forest algorithms. Theoretically, we prove that interpolation and consistency cannot be achieved simultaneously for non-adaptive random forests. Since adaptivity seems to be the cornerstone to bring together interpolation and consistency, we introduce and study interpolating Adaptive Centered Forests, which are proved to be consistent in a noiseless scenario. Numerical experiments show that Breiman's random forests are consistent while exactly interpolating, when no bootstrap step is involved. We theoretically control the size of the interpolation area, which converges fast enough to zero, so that exact interpolation and consistency occur in conjunction.
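The role of the bootstrap step in interpolation can be illustrated with a small toy example. The sketch below is not Breiman's algorithm and not the Adaptive Centered Forests of the paper: it is a hypothetical one-dimensional forest with purely random cuts, and all names (`grow_tree`, `fit_forest`, `max_train_error`) are assumptions made for illustration. It only shows that averaging fully-grown trees built on the full sample reproduces every training label, whereas adding a bootstrap step generally breaks exact interpolation.

```python
import math
import random


def grow_tree(points, rng):
    """Split 1-D (x, y) points at random cuts until each leaf holds one point."""
    if len(points) == 1:
        return ("leaf", points[0][1])
    xs = sorted(p[0] for p in points)
    # Cut at a randomly chosen x value (never the largest), so both sides are non-empty.
    cut = xs[rng.randrange(len(xs) - 1)]
    left = [p for p in points if p[0] <= cut]
    right = [p for p in points if p[0] > cut]
    return ("node", cut, grow_tree(left, rng), grow_tree(right, rng))


def predict_tree(tree, x):
    """Follow the cuts down to a leaf and return its stored label."""
    while tree[0] == "node":
        _, cut, left, right = tree
        tree = left if x <= cut else right
    return tree[1]


def fit_forest(points, n_trees, bootstrap, seed=0):
    """Grow n_trees fully-grown trees, optionally on bootstrap resamples."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = points
        if bootstrap:
            # Resample with replacement, then drop repeated draws so x values stay distinct.
            sample = list(set(rng.choices(points, k=len(points))))
        forest.append(grow_tree(sample, rng))
    return forest


def predict_forest(forest, x):
    """Average the individual tree predictions."""
    return sum(predict_tree(t, x) for t in forest) / len(forest)


def max_train_error(forest, points):
    """Largest absolute gap between the forest prediction and a training label."""
    return max(abs(predict_forest(forest, x) - y) for x, y in points)


# Noisy training sample on a grid with distinct x values.
data_rng = random.Random(42)
train = [(i / 20, math.sin(2 * math.pi * i / 20) + data_rng.gauss(0, 0.3))
         for i in range(20)]

err_full = max_train_error(fit_forest(train, 25, bootstrap=False), train)
err_boot = max_train_error(fit_forest(train, 25, bootstrap=True), train)

print("max training error without bootstrap:", err_full)
print("max training error with bootstrap:   ", err_boot)
```

Without the bootstrap step, every tree contains each training point alone in a leaf, so the forest average matches each label up to floating-point rounding. With bootstrap, a tree that misses a point predicts a neighbouring label there, so the average usually departs from the training label.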
