Bayesian additive regression trees (BART) is a non-parametric method for approximating functions. It is a black-box method based on a sum of many trees, where priors are used to regularize inference, mainly by restricting each tree's learning capacity so that no individual tree can explain the data on its own; only the sum of trees can. We discuss BART in the context of probabilistic programming languages (PPLs); specifically, we introduce a BART implementation extending PyMC, a Python library for probabilistic programming. We present a few examples of models that can be built with this probabilistic-programming-oriented version of BART, discuss recommendations for sampling diagnostics and the selection of model hyperparameters, and close with the limitations of the current approach and future extensions.
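The core idea that no single weak tree explains the data, while their sum does, can be illustrated with a toy sketch. The snippet below fits many depth-1 "stumps" to residuals with a small shrinkage factor, a deliberately simplified, greedy stand-in for BART's regularizing priors and MCMC sampling; all names here are illustrative, and this is not the PyMC BART API.

```python
import math

def fit_stump(xs, rs):
    """Find the single split that best fits residuals rs with two constants."""
    best = None
    for split in xs:
        left = [r for x, r in zip(xs, rs) if x <= split]
        right = [r for x, r in zip(xs, rs) if x > split]
        if not left or not right:
            continue
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, split, lmean, rmean)
    _, split, lmean, rmean = best
    return lambda x: lmean if x <= split else rmean

# A smooth target that no single stump can capture.
xs = [i / 50 for i in range(50)]
ys = [math.sin(2 * math.pi * x) for x in xs]

shrink = 0.1          # keeps each tree weak, loosely mimicking BART's priors
pred = [0.0] * len(xs)
for _ in range(200):  # the approximation comes from the *sum* of many trees
    resid = [y - p for y, p in zip(xs, pred) or zip(ys, pred)]
    resid = [y - p for y, p in zip(ys, pred)]
    stump = fit_stump(xs, resid)
    pred = [p + shrink * stump(x) for x, p in zip(xs, pred)]

mse = sum((y - p) ** 2 for y, p in zip(ys, pred)) / len(xs)
print(f"final MSE of the sum of 200 stumps: {mse:.4f}")
```

Any single stump here leaves a large residual; only the shrunken sum tracks the sine curve, which is the intuition the abstract describes.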