Regression trees and their ensembles are popular tools for nonparametric regression: they combine strong predictive performance with interpretable estimators. To improve their utility for locally smooth response surfaces, we study regression trees and random forests with linear aggregation functions. We introduce a new algorithm that finds the best axis-aligned split when linear aggregation functions are fit on the resulting nodes, and we provide a quasilinear-time implementation. We demonstrate the algorithm's favorable performance on real-world benchmarks and in an extensive simulation study, and we illustrate its improved interpretability using a large get-out-the-vote experiment. We provide an open-source software package that implements several tree-based estimators with linear aggregation functions.
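To make the splitting criterion concrete, the following is a minimal sketch of what "best axis-aligned split with linear aggregation functions" can mean. It is a naive exhaustive search that refits a linear model on both sides of every candidate threshold, so it is not the quasilinear-time algorithm described above and not the API of the accompanying package. The function names `ridge_sse` and `best_linear_split`, the small ridge penalty `lam`, and the `min_leaf` constraint are assumptions made for illustration only.

```python
import numpy as np


def ridge_sse(X, y, lam=1e-3):
    """Fit a ridge regression (intercept unpenalized) and return its sum of squared errors."""
    A = np.column_stack([np.ones(len(X)), X])   # design matrix with intercept column
    penalty = lam * np.eye(A.shape[1])
    penalty[0, 0] = 0.0                          # assumption: do not shrink the intercept
    beta = np.linalg.solve(A.T @ A + penalty, A.T @ y)
    resid = y - A @ beta
    return float(resid @ resid)


def best_linear_split(X, y, min_leaf=5, lam=1e-3):
    """Exhaustively search axis-aligned splits; return (feature, threshold, total SSE).

    Each candidate split is scored by the combined SSE of two separate
    linear fits, one per child node. Illustrative brute force only.
    """
    n, d = X.shape
    best = (None, None, np.inf)
    for j in range(d):
        values = np.unique(X[:, j])                   # sorted distinct values of feature j
        thresholds = (values[:-1] + values[1:]) / 2.0  # midpoints between neighbours
        for t in thresholds:
            left = X[:, j] <= t
            n_left = int(left.sum())
            if n_left < min_leaf or n - n_left < min_leaf:
                continue                               # skip children that are too small
            sse = ridge_sse(X[left], y[left], lam) + ridge_sse(X[~left], y[~left], lam)
            if sse < best[2]:
                best = (j, float(t), sse)
    return best
```

On data generated from two different linear models separated along one feature, this search should recover that feature and a threshold near the true boundary. Its cost grows roughly quadratically in the number of observations because every candidate refits both children from scratch, which is the kind of overhead a quasilinear implementation would need to avoid.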