Asymptotic Properties of High-Dimensional Random Forests - Zhuanzhi Papers


Random forests · Random forests algorithm · Bias · Subsampling · Feature space

June 18, 2021

Asymptotic Properties of High-Dimensional Random Forests


Chien-Ming Chi, Patrick Vossler, Yingying Fan, Jinchi Lv

From arXiv, 63 pages, 4 figures

As a flexible nonparametric learning tool, random forests have been widely applied in a variety of real applications with appealing empirical performance, even in the presence of a high-dimensional feature space. Unveiling the underlying mechanisms has led to some important recent theoretical results on the consistency of the random forests algorithm and its variants. However, to our knowledge, all existing works concerning random forests consistency under the setting of high dimensionality were done for various modified random forests models where the splitting rules are independent of the response. In light of this, in this paper we derive the consistency rates for the random forests algorithm associated with the sample CART splitting criterion, which is the one used in the original version of the algorithm in Breiman (2001), in a general high-dimensional nonparametric regression setting through a bias-variance decomposition analysis. Our new theoretical results show that random forests can indeed adapt to high dimensionality and allow for discontinuous regression functions. Our bias analysis characterizes explicitly how the random forests bias depends on the sample size, tree height, and column subsampling parameter. Some limitations of our current results are also discussed.
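To make the "sample CART splitting criterion" and the "column subsampling parameter" concrete, the following is a minimal sketch, not the authors' code. It picks the split that most reduces the within-node sum of squared errors, searching only a random subset of columns. The names `sse`, `best_cart_split`, and `gamma` are illustrative assumptions, as is the convention that a row goes left when its feature value is strictly below the threshold.

```python
import random


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_cart_split(X, y, gamma=1.0, rng=None):
    """Return (feature, threshold, decrease) with the largest SSE decrease.

    X: list of rows (lists of floats); y: list of responses.
    gamma: hypothetical column subsampling fraction in (0, 1]; only about
    gamma * p randomly chosen columns are searched at this node.
    """
    rng = rng or random.Random(0)
    p = len(X[0])
    n_candidates = max(1, int(round(gamma * p)))
    features = rng.sample(range(p), n_candidates)
    parent = sse(y)
    best = (None, None, 0.0)
    for j in features:
        for threshold in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] < threshold]
            right = [yi for row, yi in zip(X, y) if row[j] >= threshold]
            if not left or not right:
                continue  # skip splits that leave one child empty
            decrease = parent - sse(left) - sse(right)
            if decrease > best[2]:
                best = (j, threshold, decrease)
    return best


# Toy data: the response is a step function of column 0; column 1 is noise.
X = [[0.1, 0.9], [0.2, 0.1], [0.3, 0.5], [0.7, 0.4], [0.8, 0.8], [0.9, 0.2]]
y = [0, 0, 0, 1, 1, 1]
feature, threshold, decrease = best_cart_split(X, y, gamma=1.0)
print(feature, threshold, decrease)  # prints: 0 0.7 1.5
```

Because the criterion depends on `y`, the chosen split adapts to the response, which is the property the abstract contrasts with modified forests whose splitting rules ignore the response. The toy response is deliberately discontinuous in column 0.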


Random forests

A random forest is a classifier that uses multiple trees to train on samples and make predictions.
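The "multiple trees" idea can be sketched as follows, under stated assumptions. Each tree here is a depth-1 regression stump fit on a random subsample drawn without replacement, and the forest prediction is the plain average over trees. Regression is used to match the paper's setting; a classifier would typically take a majority vote instead. All names (`fit_stump`, `fit_forest`, `subsample`, and so on) are hypothetical, and real forests grow much deeper trees.

```python
import random


def fit_stump(X, y):
    """Fit a depth-1 regression tree: the single split with least squared error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] < t]
            right = [yi for row, yi in zip(X, y) if row[j] >= t]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            err = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or err < best[0]:
                best = (err, j, t, ml, mr)
    if best is None:  # no valid split: predict the mean everywhere
        mean = sum(y) / len(y)
        return (0, float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    """Return the left or right child mean depending on the split."""
    j, t, ml, mr = stump
    return ml if x[j] < t else mr


def fit_forest(X, y, n_trees=25, subsample=0.7, seed=0):
    """Fit n_trees stumps, each on a random subsample without replacement."""
    rng = random.Random(seed)
    n = len(X)
    size = max(2, int(subsample * n))
    forest = []
    for _ in range(n_trees):
        idx = rng.sample(range(n), size)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    """Average the individual tree predictions."""
    return sum(predict_stump(s, x) for s in forest) / len(forest)


# Toy data: a step function of a single feature.
X_toy = [[i / 10] for i in range(10)]
y_toy = [0] * 5 + [1] * 5
forest = fit_forest(X_toy, y_toy)
low = predict_forest(forest, [0.05])
high = predict_forest(forest, [0.95])
```

On this toy data every subsample contains points from both sides of the step, so each stump recovers it and the averaged predictions `low` and `high` fall on opposite sides of 0.5.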
