Decision trees are among the most useful and popular methods in the machine learning toolbox. In this paper, we consider the problem of learning optimal decision trees, a combinatorial optimization problem that is challenging to solve at scale. A common approach in the literature is to use greedy heuristics, which may not yield optimal trees. Recently, there has been significant interest in learning optimal decision trees via various approaches (e.g., based on integer programming or dynamic programming); to achieve computational scalability, most of these approaches focus on classification tasks with binary features. In this paper, we present a new discrete optimization method based on branch-and-bound (BnB) to obtain optimal decision trees. Unlike existing customized approaches, we consider both regression and classification tasks with continuous features. The basic idea underlying our approach is to split the search space based on quantiles of the feature distribution, yielding upper and lower bounds for the underlying optimization problem along the BnB iterations. Our proposed algorithm, Quant-BnB, shows significant speedups over existing approaches for shallow optimal trees on various real datasets.
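To convey the quantile-branching principle at a high level, the following is a minimal sketch, not the authors' implementation, of a quantile-based branch-and-bound search for a depth-1 regression stump on a single continuous feature. All function and variable names (e.g., quant_bnb_stump, interval_lower_bound) are illustrative assumptions, and the bounds used here are deliberately simple stand-ins for the paper's refined bounds.

```python
# A hedged sketch of quantile-based branch-and-bound for one feature.
# Assumption: squared-error loss with a constant prediction per leaf.
import numpy as np

def sse(y):
    """Sum of squared errors of y around its mean (0 for empty arrays)."""
    return 0.0 if len(y) == 0 else float(np.sum((y - y.mean()) ** 2))

def stump_loss(x, y, t):
    """Exact loss of a stump sending x <= t left and x > t right."""
    left = x <= t
    return sse(y[left]) + sse(y[~left])

def interval_lower_bound(x, y, a, b):
    """Valid lower bound on the loss of any threshold t in (a, b]:
    points with x <= a are surely left, points with x > b surely right;
    each certain group's SSE around its own mean lower-bounds its leaf's
    contribution, and uncertain points in (a, b] contribute at least 0."""
    return sse(y[x <= a]) + sse(y[x > b])

def quant_bnb_stump(x, y, n_quantiles=8, tol=1e-4):
    """Branch on quantile intervals of x, pruning any interval whose
    lower bound already exceeds the best exact loss found so far."""
    qs = np.quantile(x, np.linspace(0, 1, n_quantiles + 1))
    intervals = [(qs[i], qs[i + 1]) for i in range(n_quantiles) if qs[i] < qs[i + 1]]
    best_t, best_loss = None, np.inf
    while intervals:
        a, b = intervals.pop(0)
        if interval_lower_bound(x, y, a, b) >= best_loss - tol:
            continue  # no threshold in (a, b] can beat the incumbent
        mid = 0.5 * (a + b)
        loss = stump_loss(x, y, mid)  # upper bound from a concrete threshold
        if loss < best_loss:
            best_t, best_loss = mid, loss
        if b - a > tol:  # refine: split the interval and search both halves
            intervals += [(a, mid), (mid, b)]
    return best_t, best_loss

# Illustrative usage on synthetic data: the recovered threshold should
# land near the true split at 0.3.
rng = np.random.default_rng(0)
x = rng.uniform(size=200)
y = (x > 0.3).astype(float) + 0.1 * rng.normal(size=200)
t, loss = quant_bnb_stump(x, y)
```

The actual Quant-BnB algorithm searches over many features and deeper (but still shallow) trees with substantially tighter bounds; this sketch only illustrates how branching on quantile intervals yields upper and lower bounds that let BnB prune regions of the threshold space.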