Optimal Decision Trees for Nonlinear Metrics

15 October 2021


Emir Demirović, Peter J. Stuckey

Nonlinear metrics, such as the F1-score, Matthews correlation coefficient, and Fowlkes-Mallows index, are often used to evaluate the performance of machine learning models, particularly on imbalanced datasets that contain more samples of one class than the other. Recent optimal decision tree algorithms have shown remarkable progress in producing trees that are optimal with respect to linear criteria, such as accuracy, but nonlinear metrics remain a challenge. To address this gap, we propose a novel algorithm based on bi-objective optimisation, which treats the misclassifications of each binary class as a separate objective. We show that, for a large class of metrics, the optimal tree lies on the Pareto frontier. Consequently, we obtain the optimal tree by using our method to generate the set of all nondominated trees. To the best of our knowledge, this is the first method to compute provably optimal decision trees for nonlinear metrics. Our approach leads to a trade-off when compared to optimising linear metrics: the resulting trees may be more desirable according to the given nonlinear metric, at the expense of higher runtimes. Nevertheless, the experiments illustrate that runtimes are reasonable for the majority of the tested datasets.
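The selection step described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm, which searches over trees to generate the nondominated set; it only shows why restricting attention to the Pareto frontier is enough for a metric such as the F1-score. The candidate list, the class counts, and the function names `f1_score`, `pareto_front`, and `best_by_metric` are all assumptions made for the example. Each candidate tree is summarised by a pair (false positives, false negatives).

```python
def f1_score(fp, fn, n_pos):
    """F1-score from error counts, given n_pos positive samples."""
    tp = n_pos - fn
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def pareto_front(points):
    """Return the (fp, fn) pairs that no other pair dominates.

    A pair dominates another if it is no worse in both objectives
    and strictly better in at least one.
    """
    front = []
    best_fn = float("inf")
    # After sorting by fp (then fn), a pair is nondominated exactly
    # when its fn is strictly below every fn seen so far.
    for fp, fn in sorted(set(points)):
        if fn < best_fn:
            front.append((fp, fn))
            best_fn = fn
    return front


def best_by_metric(points, n_pos):
    """Pick the frontier point with the highest F1-score."""
    return max(pareto_front(points), key=lambda p: f1_score(p[0], p[1], n_pos))


# Hypothetical error pairs for candidate trees on an imbalanced
# dataset with 10 positive and 90 negative samples.
N_POS = 10
candidates = [(0, 10), (1, 6), (4, 7), (5, 3), (9, 3), (20, 1), (30, 1)]

front = pareto_front(candidates)
best_f1 = best_by_metric(candidates, N_POS)
# Maximising accuracy is the same as minimising total errors.
best_accuracy = min(candidates, key=lambda p: p[0] + p[1])

print("Pareto front:", front)
print("Best by F1:", best_f1)
print("Best by accuracy:", best_accuracy)
```

In this made-up data the accuracy-optimal pair and the F1-optimal pair differ, which is the situation the abstract targets. Because the F1-score never improves when either error count grows, scanning only the frontier gives the same result as scanning every candidate.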

