We present a Bayesian nonparametric model for conditional distribution estimation using Bayesian additive regression trees (BART). The generative model is based on rejection sampling from a base model. As is typical of BART models, it is flexible, has a default prior specification, and is computationally convenient. Because the response plays a distinguished role in the proposed BART model, we further introduce an approach to targeted smoothing, which may be of independent interest for BART models. We study the proposed model theoretically and provide sufficient conditions for the posterior distribution to concentrate at close to the minimax optimal rate, adaptively over smoothness classes, in the high-dimensional regime in which many predictors are irrelevant. To fit the model, we propose a data augmentation algorithm that allows existing BART samplers to be extended with minimal effort. We illustrate the performance of our methodology on simulated data and use it to study the relationship between education and body mass index using data from the Medical Expenditure Panel Survey (MEPS).
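To make the phrase "rejection sampling from a base model" concrete, the following is a minimal sketch and not the paper's actual specification. It assumes a standard normal base model for the response, a probit acceptance probability, and a hypothetical function `r(x, y)` standing in for a fitted sum-of-trees function; none of these choices is stated in the abstract. A proposal `y` is drawn from the base model and kept with probability `normal_cdf(r(x, y))`, so the accepted draws follow a density proportional to the base density times that acceptance probability.

```python
import math
import random


def normal_cdf(z):
    """Standard normal CDF, used here as an assumed acceptance link."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def r(x, y):
    """Hypothetical stand-in for a fitted sum-of-trees function of (x, y)."""
    return 2.0 * (y - x)


def sample_response(x, rng, max_tries=10_000):
    """Draw one response given predictor x by rejection sampling.

    Proposals come from an assumed N(0, 1) base model and are accepted
    with probability normal_cdf(r(x, y)). Returns the accepted draw and
    the list of proposals rejected along the way.
    """
    rejected = []
    for _ in range(max_tries):
        y = rng.gauss(0.0, 1.0)               # proposal from the base model
        if rng.random() < normal_cdf(r(x, y)):
            return y, rejected                # accepted draw
        rejected.append(y)                    # rejected proposal
    raise RuntimeError("no proposal accepted within max_tries")


rng = random.Random(0)                        # fixed seed for reproducibility
draws = [sample_response(0.5, rng)[0] for _ in range(2000)]
mean_y = sum(draws) / len(draws)
```

With this choice of `r`, larger values of `y` are accepted more often, so the sampled conditional distribution is shifted and skewed relative to the base model. The sketch also returns the rejected proposals; treating them as latent variables is only a guess at the kind of quantity a data augmentation scheme might use, since the abstract does not describe the algorithm.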