Enhancing Robustness of Gradient-Boosted Decision Trees through One-Hot Encoding and Regularization

Keywords: robustness · gradient-boosted decision trees · regularization term · one-hot · linear

May 11, 2023

Shijie Cui, Agus Sudjianto, Aijun Zhang, Runze Li

Gradient-boosted decision trees (GBDT) are a widely used and highly effective machine learning approach for tabular data modeling. However, their complex structure may lead to low robustness against small covariate perturbations in unseen data. In this study, we apply one-hot encoding to convert a GBDT model into a linear framework by encoding each tree leaf as one dummy variable. This allows for the use of linear regression techniques, together with a novel risk decomposition for assessing the robustness of a GBDT model against covariate perturbations. We propose to enhance the robustness of GBDT models by refitting their linear regression forms with $L_1$ or $L_2$ regularization. Theoretical results are obtained on the effect of regularization on model performance and robustness. Numerical experiments demonstrate that the proposed regularization approach can enhance the robustness of the one-hot-encoded GBDT models.
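
The idea of turning a boosted ensemble into a linear model on leaf indicators can be illustrated with a small sketch. The code below is not the authors' implementation. It assumes a toy setting (one feature, depth-1 trees, squared-error boosting), and all function names, the learning rate, and the penalty values are hypothetical choices made for illustration. It shows that the ensemble prediction equals a base score plus a dot product between the one-hot leaf matrix and the leaf values, and then refits those coefficients with an $L_2$ (ridge) penalty in closed form. An $L_1$ refit would need an iterative solver and is omitted here.

```python
import numpy as np

def fit_stump(x, r):
    """Best one-split regression stump on a 1-D feature x for residuals r."""
    best = None
    for thr in np.unique(x)[:-1]:  # dropping the max keeps both sides non-empty
        left = x <= thr
        lv, rv = r[left].mean(), r[~left].mean()
        sse = ((r[left] - lv) ** 2).sum() + ((r[~left] - rv) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, thr, lv, rv)
    return best[1:]  # (threshold, left leaf value, right leaf value)

def fit_boosted_stumps(x, y, n_trees=20, lr=0.3):
    """Toy squared-error boosting; returns the base score and a list of stumps."""
    base = y.mean()
    pred = np.full_like(y, base)
    trees = []
    for _ in range(n_trees):
        thr, lv, rv = fit_stump(x, y - pred)
        trees.append((thr, lr * lv, lr * rv))  # store shrunken leaf values
        pred += np.where(x <= thr, lr * lv, lr * rv)
    return base, trees

def predict_boosted(x, base, trees):
    """Usual tree-by-tree evaluation of the ensemble."""
    return base + sum(np.where(x <= thr, lv, rv) for thr, lv, rv in trees)

def leaf_one_hot(x, trees):
    """One dummy column per leaf: [left, right] for every stump."""
    cols = []
    for thr, _, _ in trees:
        left = (x <= thr).astype(float)
        cols += [left, 1.0 - left]
    return np.column_stack(cols)

def ridge_refit(Z, y, intercept, lam):
    """Closed-form ridge on the leaf dummies, with the intercept held fixed."""
    A = Z.T @ Z + lam * np.eye(Z.shape[1])
    return np.linalg.solve(A, Z.T @ (y - intercept))

rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 3.0, size=200)
y = np.sin(x) + 0.3 * rng.normal(size=200)

base, trees = fit_boosted_stumps(x, y)
Z = leaf_one_hot(x, trees)

# The original ensemble written as a linear model on the leaf dummies.
w_gbdt = np.array([[lv, rv] for _, lv, rv in trees]).ravel()
linear_form_ok = np.allclose(base + Z @ w_gbdt, predict_boosted(x, base, trees))

# Refit the same linear form with a weak and a strong L2 penalty.
w_weak = ridge_refit(Z, y, base, lam=0.1)
w_strong = ridge_refit(Z, y, base, lam=10.0)

# Perturb the covariate slightly and measure how much predictions move.
x_pert = x + 0.05 * rng.normal(size=x.size)
Z_pert = leaf_one_hot(x_pert, trees)
for name, w in [("original", w_gbdt), ("ridge lam=0.1", w_weak), ("ridge lam=10", w_strong)]:
    shift = np.mean((Z_pert @ w - Z @ w) ** 2)
    print(f"{name}: mean squared prediction change = {shift:.5f}")
```

The printed prediction changes are only a rough, single-run indicator on synthetic data; they are not the risk decomposition described in the abstract, and whether a given penalty helps depends on the data and the perturbation size.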
