Bayesian additive regression trees (BART) is a semi-parametric regression model offering state-of-the-art performance on out-of-sample prediction. Despite this success, standard implementations of BART typically provide inaccurate predictions and overly narrow prediction intervals at points outside the range of the training data. This paper proposes a novel extrapolation strategy that grafts Gaussian processes to the leaf nodes in BART for predicting points outside the range of the observed data. The new method is compared to standard BART implementations and recent frequentist resampling-based methods for predictive inference. We apply the new approach to a challenging problem from causal inference, wherein for some regions of predictor space, only treated or untreated units are observed (but not both). In simulation studies, the new approach outperforms popular alternatives, such as Jackknife+.
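To illustrate why a Gaussian process attached to a leaf can widen prediction intervals outside the training range, here is a minimal sketch. It is not the paper's algorithm: it treats a single leaf in isolation, uses one predictor, a squared-exponential kernel, and fixed hyperparameters, all of which are assumptions made for illustration. The function names (`rbf_kernel`, `gp_leaf_predict`) and the parameter values are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_leaf_predict(x_leaf, y_leaf, x_new,
                    noise_var=0.1, length_scale=1.0, signal_var=1.0):
    """GP posterior mean and predictive variance at x_new, given one leaf's data.

    The GP is centred on the leaf mean (an assumption of this sketch), so far
    from the data the mean reverts to the constant leaf value while the
    variance rises toward signal_var + noise_var.
    """
    mu = y_leaf.mean()
    K = rbf_kernel(x_leaf, x_leaf, length_scale, signal_var)
    K += noise_var * np.eye(len(x_leaf))           # observation noise on the diagonal
    K_s = rbf_kernel(x_new, x_leaf, length_scale, signal_var)
    alpha = np.linalg.solve(K, y_leaf - mu)
    mean = mu + K_s @ alpha
    v = np.linalg.solve(K, K_s.T)                  # K^{-1} k(x_leaf, x_new)
    var = signal_var - np.sum(K_s * v.T, axis=1) + noise_var
    return mean, var

# Toy leaf: training inputs confined to [0, 1].
x_leaf = np.linspace(0.0, 1.0, 20)
y_leaf = np.sin(2.0 * np.pi * x_leaf)

# One point inside the training range, two increasingly far outside it.
x_new = np.array([0.5, 3.0, 10.0])
mean, var = gp_leaf_predict(x_leaf, y_leaf, x_new)
```

In this sketch the predictive variance grows as the query point moves away from the leaf's training inputs, which is the behavior a constant leaf value cannot provide. How the paper integrates such a component into the full sum-of-trees model and its posterior sampling is not covered here.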