A $k$-decision tree $t$ (or $k$-tree) is a recursive partition of a matrix (2D-signal) into $k\geq 1$ block matrices (axis-parallel rectangles, its leaves), where each rectangle is assigned a real label. Its regression or classification loss with respect to a given matrix $D$ of $N$ entries (labels) is the sum of squared differences between every label in $D$ and the label that $t$ assigns to it. Given an error parameter $\varepsilon\in(0,1)$, a $(k,\varepsilon)$-coreset $C$ of $D$ is a small summarization that provably approximates this loss for \emph{every} such tree, up to a multiplicative factor of $1\pm\varepsilon$. In particular, the optimal $k$-tree of $C$ is a $(1+\varepsilon)$-approximation to the optimal $k$-tree of $D$. We provide the first algorithm that outputs such a $(k,\varepsilon)$-coreset for \emph{every} such matrix $D$. The size $|C|$ of the coreset is polynomial in $k\log(N)/\varepsilon$, and its construction takes $O(Nk)$ time. This is achieved by forging a link between decision trees from machine learning and partition trees from computational geometry. Experimental results on \texttt{sklearn} and \texttt{lightGBM} show that applying our coresets to real-world datasets speeds up the training of random forests and their parameter tuning by up to $\times 10$, while maintaining similar accuracy. Full open source code is provided.
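In symbols (a sketch under assumed notation; the paper's body may state these definitions slightly differently): for a matrix $D\in\mathbb{R}^{n\times m}$ with $N=nm$ entries and a $k$-tree $t$, the loss is
\[
\mathrm{loss}(D,t) \;=\; \sum_{i=1}^{n}\sum_{j=1}^{m}\bigl(D_{i,j}-t(i,j)\bigr)^2,
\]
where $t(i,j)$ is the label of the leaf rectangle that contains entry $(i,j)$. A $(k,\varepsilon)$-coreset $C$ (a weighted subset of entries, whose loss is the correspondingly weighted sum) then satisfies, for every $k$-tree $t$,
\[
(1-\varepsilon)\,\mathrm{loss}(D,t) \;\le\; \mathrm{loss}(C,t) \;\le\; (1+\varepsilon)\,\mathrm{loss}(D,t).
\]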
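To illustrate how such a coreset plugs into standard tooling, the following Python sketch trains an \texttt{sklearn} random forest on a weighted coreset via the library's \texttt{sample\_weight} argument. The function \texttt{build\_coreset} is a hypothetical stand-in for our construction, and the uniform sampling inside it is for illustration only; it does not provide the $(1\pm\varepsilon)$ guarantee of the actual algorithm.
\begin{verbatim}
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def build_coreset(D, k, eps, seed=0):
    """Hypothetical stand-in for the (k, eps)-coreset construction.

    Returns entry coordinates, their labels, and importance weights.
    Uniform sampling is used here for illustration only; the actual
    algorithm chooses entries so that the weighted loss of *every*
    k-tree is preserved up to a factor of 1 +/- eps.
    """
    n, m = D.shape
    N = n * m
    size = int(np.ceil(k * np.log(N) / eps))   # polynomial in k log(N)/eps
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, N, size=size)        # sampled entry indices
    coords = np.column_stack(np.unravel_index(idx, (n, m)))
    labels = D.ravel()[idx]
    weights = np.full(size, N / size)          # inverse sampling probability
    return coords, labels, weights

D = np.random.rand(256, 256)                   # a 2D signal, N = 256 * 256
X, y, w = build_coreset(D, k=16, eps=0.1)
model = RandomForestRegressor(n_estimators=50).fit(X, y, sample_weight=w)
\end{verbatim}
Training and parameter tuning then operate on $|C|\ll N$ weighted points rather than on all of $D$, which is the source of the reported speedups.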