Decision tree learning is increasingly being used for pointwise inference. Important applications include heterogeneous causal treatment effects and dynamic policy decisions, as well as conditional quantile regression and design of experiments, where tree estimation and inference are conducted at specific values of the covariates. In this paper, we call into question the use of decision trees (trained by adaptive recursive partitioning) for such purposes by demonstrating that they can fail to achieve polynomial rates of convergence in uniform norm, even with pruning. Instead, the convergence may be poly-logarithmic or, in some important special cases such as honest regression trees, may fail completely. We show that random forests can remedy the situation, turning poor-performing trees into nearly optimal procedures, at the cost of losing interpretability and introducing two additional tuning parameters. The two hallmarks of random forests, subsampling and the random feature selection mechanism, are shown to each contribute distinctly to achieving nearly optimal performance for the model class considered.
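To make the two hallmarks concrete, the following is a minimal sketch, assuming a scikit-learn setup (an illustrative choice, not the paper's estimators): it contrasts pointwise prediction by a single pruned regression tree with a random forest whose subsampling and random feature selection are controlled explicitly via the `max_samples` and `max_features` parameters. The data-generating process and all parameter values below are hypothetical.

```python
# Illustrative sketch: single pruned tree vs. random forest with explicit
# subsampling and random feature selection (scikit-learn). Not the paper's
# estimators; honest trees and theoretical rates are not reproduced here.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n, d = 2000, 5
X = rng.uniform(size=(n, d))
y = np.sin(4 * X[:, 0]) + 0.5 * rng.normal(size=n)  # smooth signal + noise

# Single tree, pruned via cost-complexity pruning (ccp_alpha).
tree = DecisionTreeRegressor(ccp_alpha=1e-3, random_state=0).fit(X, y)

# Random forest: max_samples controls subsampling of the training data,
# max_features controls random feature selection at each split.
forest = RandomForestRegressor(
    n_estimators=500,
    max_samples=0.5,    # each tree is grown on a subsample of half the data
    max_features=0.33,  # each split considers a random third of the features
    random_state=0,
).fit(X, y)

# Pointwise comparison at a fixed covariate value x0, the regime the
# abstract is concerned with (estimation at specific covariate values).
x0 = np.full((1, d), 0.5)
print("tree prediction:  ", tree.predict(x0)[0])
print("forest prediction:", forest.predict(x0)[0])
```

In this framing, setting `max_samples=1.0` and `max_features=1.0` (no subsampling, no feature randomization) would collapse the forest back toward bagged fully grown trees, isolating how each mechanism contributes to pointwise performance.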