Decision trees have driven tremendous progress in many fields in recent years. In simple terms, a decision tree applies a divide-and-conquer strategy: it breaks the complex problem of modeling the dependency between input features and labels into a sequence of smaller subproblems. While decision trees have a long history, recent advances have greatly improved their performance in computational advertising, recommender systems, information retrieval, and related areas. We introduce common tree-based models (e.g., Bayesian CART, Bayesian regression splines) and training techniques (e.g., mixed-integer programming, alternating optimization, gradient descent). Along the way, we highlight the probabilistic characteristics of tree-based models and explain their practical and theoretical benefits. Beyond machine learning and data mining, we also cover theoretical advances on tree-based models from other fields such as statistics and operations research. We list reproducible resources at the end of each method.
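The divide-and-conquer idea above can be illustrated with a minimal sketch of recursive splitting for 1-D regression. This is an illustrative toy, not the survey's method: all function names are hypothetical, and leaves simply predict the mean of their partition, as in CART-style regression trees.

```python
def best_split(xs, ys):
    """Pick the threshold on a 1-D feature minimizing total squared error."""
    best = None
    for t in sorted(set(xs))[1:]:  # candidate thresholds between unique values
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        sse = sum((y - sum(left) / len(left)) ** 2 for y in left) \
            + sum((y - sum(right) / len(right)) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t)
    return best[1]

def build_tree(xs, ys, depth=0, max_depth=2):
    """Divide: split the data at the best threshold; conquer: fit each part."""
    if depth == max_depth or len(set(xs)) < 2:
        return sum(ys) / len(ys)  # leaf node: predict the mean label
    t = best_split(xs, ys)
    left = [(x, y) for x, y in zip(xs, ys) if x < t]
    right = [(x, y) for x, y in zip(xs, ys) if x >= t]
    return {"threshold": t,
            "left": build_tree(*zip(*left), depth + 1, max_depth),
            "right": build_tree(*zip(*right), depth + 1, max_depth)}

def predict(tree, x):
    """Route an input down the tree until a leaf value is reached."""
    while isinstance(tree, dict):
        tree = tree["left"] if x < tree["threshold"] else tree["right"]
    return tree
```

For example, `build_tree([1, 2, 3, 4], [0.0, 0.0, 10.0, 10.0])` first splits at the threshold 3, separating the low-label from the high-label points, then recurses on each half; each recursive call faces a strictly smaller subproblem, which is the essence of the divide-and-conquer strategy.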