Generalised Boosted Forests - Zhuanzhi Papers

Estimation/Estimator · Random Forest · Reducible · Boosting (an ensemble learning method) · Log-Likelihood

24 February 2021

Generalised Boosted Forests

Indrayudh Ghosal, Giles Hooker

From arXiv. Paper: 14 pages, 4 figures, 3 tables; Appendix: 34 pages, 28 figures, 1 table

This paper extends recent work on boosting random forests to model non-Gaussian responses. Given an exponential family model with $\mathbb{E}[Y|X] = g^{-1}(f(X))$, where $g$ is the link function, our goal is to obtain an estimate of $f$. We start with an MLE-type estimate in the link space and then define generalised residuals from it. We use these residuals and some corresponding weights to fit a base random forest, and then repeat the procedure to obtain a boost random forest. We call the sum of these three estimators a "generalised boosted forest". We show with simulated and real data that both random forest steps reduce test-set log-likelihood, which we treat as our primary metric. We also provide a variance estimator, which can be obtained at the same computational cost as the original estimate itself. Empirical experiments on real-world data and simulations demonstrate that the methods can effectively reduce bias, and that confidence interval coverage is conservative in the bulk of the covariate distribution.
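The three-stage procedure in the abstract can be illustrated with a small, self-contained sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the response is Poisson with a log link; the "generalised residuals" and "corresponding weights" are taken to be Newton-style working residuals $-\ell'/\ell''$ and weights $-\ell''$ of the log-likelihood $\ell$; bagged regression stumps on a single covariate stand in for a real random forest; and all function names are hypothetical. The sketch tracks the training negative log-likelihood only, not the test-set metric, and it does not attempt the variance estimator.

```python
import math
import random

def poisson_draw(mu, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, k, p = math.exp(-mu), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def fit_stump(xs, rs, ws):
    """Weighted least-squares regression stump on one covariate."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    tot_w = sum(ws)
    tot_wr = sum(w * r for w, r in zip(ws, rs))
    # Start from "no split": threshold +inf sends every point to the left value.
    best_score = tot_wr ** 2 / tot_w
    best = (math.inf, tot_wr / tot_w, tot_wr / tot_w)
    left_w = left_wr = 0.0
    for a, b in zip(order, order[1:]):
        left_w += ws[a]
        left_wr += ws[a] * rs[a]
        if xs[a] == xs[b]:
            continue  # cannot split between tied covariate values
        right_w, right_wr = tot_w - left_w, tot_wr - left_wr
        # Maximising this score minimises the weighted squared error.
        score = left_wr ** 2 / left_w + right_wr ** 2 / right_w
        if score > best_score:
            best_score = score
            best = ((xs[a] + xs[b]) / 2, left_wr / left_w, right_wr / right_w)
    return best  # (threshold, left value, right value)

def fit_forest(xs, rs, ws, n_trees, rng):
    """Bagged stumps: a toy stand-in for a random forest (assumption)."""
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap resample
        stumps.append(fit_stump([xs[i] for i in idx],
                                [rs[i] for i in idx],
                                [ws[i] for i in idx]))
    return lambda x: sum(lo if x <= t else hi for t, lo, hi in stumps) / n_trees

def working_residuals(ys, etas):
    """Assumed Newton-style residuals y/mu - 1 and weights mu (Poisson, log link)."""
    mus = [math.exp(e) for e in etas]
    return [y / m - 1.0 for y, m in zip(ys, mus)], mus

def neg_loglik(ys, etas):
    """Poisson negative log-likelihood, dropping the log(y!) constant."""
    return sum(math.exp(e) - y * e for y, e in zip(ys, etas))

def generalised_boosted_forest(xs, ys, n_trees=50, seed=0):
    rng = random.Random(seed)
    eta0 = math.log(sum(ys) / len(ys))  # stage 0: constant MLE in link space
    etas = [eta0] * len(xs)
    forests, nlls = [], [neg_loglik(ys, etas)]
    for _ in range(2):  # base forest, then boost forest
        rs, ws = working_residuals(ys, etas)
        forest = fit_forest(xs, rs, ws, n_trees, rng)
        forests.append(forest)
        etas = [e + forest(x) for e, x in zip(etas, xs)]
        nlls.append(neg_loglik(ys, etas))
    # Final estimate of f: sum of the three estimators.
    return (lambda x: eta0 + sum(f(x) for f in forests)), nlls

# Simulated data with a hypothetical true link-space function f(x) = 0.5 + 1.5 x.
rng = random.Random(1)
xs = [rng.random() for _ in range(400)]
ys = [poisson_draw(math.exp(0.5 + 1.5 * x), rng) for x in xs]
predict, nlls = generalised_boosted_forest(xs, ys)
print([round(v, 1) for v in nlls])  # training NLL after stages 0, 1, 2
```

Here `predict(x)` returns the link-space estimate of $f(x)$, and `math.exp(predict(x))` maps it back to the mean scale. In practice the bagged stumps would be replaced by a proper random forest implementation that accepts per-observation weights.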
