Gradient boosting for extreme quantile regression - Zhuanzhi (专知) Papers


March 1, 2021

Gradient boosting for extreme quantile regression


Jasper Velthoen, Clément Dombry, Juan-Juan Cai, Sebastian Engelke

Extreme quantile regression provides estimates of conditional quantiles outside the range of the data. Classical methods such as quantile random forests perform poorly in such cases because data in the tail region are too scarce. Extreme value theory motivates approximating the conditional distribution above a high threshold by a generalized Pareto distribution with covariate-dependent parameters. This model allows for extrapolation beyond the range of observed values and for estimation of conditional extreme quantiles. We propose a gradient boosting procedure that estimates a conditional generalized Pareto distribution by minimizing its deviance. Cross-validation is used to choose tuning parameters such as the number of trees and the tree depths. We discuss diagnostic tools such as variable importance and partial dependence plots, which help to interpret the fitted models. In simulation studies we show that our gradient boosting procedure outperforms classical methods from quantile regression and extreme value theory, especially for high-dimensional predictor spaces and complex parameter response surfaces. An application to statistical post-processing of weather forecasts with precipitation data in the Netherlands is presented.
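To make the extrapolation step concrete, the following is a minimal sketch of the two standard generalized Pareto (GPD) formulas the abstract relies on: the per-observation deviance (negative log-likelihood) that a boosting procedure would minimize, and the quantile extrapolation above a threshold. The function names `gpd_deviance` and `extreme_quantile` and their argument names are hypothetical and chosen for illustration; the formulas are the textbook GPD expressions, not code from the paper, and the boosting step that fits the covariate-dependent scale and shape is not shown.

```python
import math


def gpd_deviance(z, sigma, xi):
    """Negative log-likelihood of one exceedance z > 0 under a GPD.

    sigma is the scale (> 0) and xi the shape parameter. In a boosting
    setting both would be functions of the covariates; here they are
    plain numbers (an assumption made to keep the sketch small).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if abs(xi) < 1e-12:
        # Limit xi -> 0: exponential distribution with scale sigma.
        return math.log(sigma) + z / sigma
    t = 1.0 + xi * z / sigma
    if t <= 0:
        # z lies outside the support of the GPD for these parameters.
        return math.inf
    return math.log(sigma) + (1.0 + 1.0 / xi) * math.log(t)


def extreme_quantile(tau, tau0, threshold, sigma, xi):
    """Extrapolate the quantile at level tau from a threshold at level tau0.

    threshold is the (conditional) quantile at the intermediate level
    tau0; exceedances above it are assumed to follow GPD(sigma, xi).
    """
    if not (0.0 < tau0 < tau < 1.0):
        raise ValueError("need 0 < tau0 < tau < 1")
    ratio = (1.0 - tau0) / (1.0 - tau)
    if abs(xi) < 1e-12:
        # Exponential-tail limit of the formula below.
        return threshold + sigma * math.log(ratio)
    return threshold + sigma / xi * (ratio ** xi - 1.0)


if __name__ == "__main__":
    # Hypothetical numbers: threshold 10 at level 0.9, scale 2, shape 0.5.
    print(extreme_quantile(0.99, 0.9, 10.0, 2.0, 0.5))
```

Because the extrapolated quantile depends on the data only through the threshold, scale, and shape, it can be evaluated at levels far beyond the largest observation, which is what separates this approach from purely empirical quantile estimators.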

