This paper develops a novel stochastic tree ensemble method for nonlinear regression, which we refer to as XBART, short for Accelerated Bayesian Additive Regression Trees. By combining regularization and stochastic search strategies from Bayesian modeling with computationally efficient techniques from recursive partitioning approaches, the new method attains state-of-the-art performance: in many settings it is both faster and more accurate than the widely used XGBoost algorithm. Via careful simulation studies, we demonstrate that our new approach provides accurate pointwise estimates of the mean function and does so faster than popular alternatives such as BART, XGBoost, and neural networks (using Keras). We also prove a number of basic theoretical results about the new algorithm, including consistency of the single-tree version of the model and stationarity of the Markov chain produced by the ensemble version. Furthermore, we demonstrate that initializing standard Bayesian additive regression trees Markov chain Monte Carlo (MCMC) at XBART-fitted trees considerably improves credible interval coverage and reduces total run-time.