Bayesian additive regression trees (BART) is a non-parametric method for approximating functions. It is a black-box method based on a sum of many trees, where priors are used to regularize inference, mainly by restricting each tree's learning capacity so that no individual tree can explain the data on its own; only the sum of trees can. We discuss BART in the context of probabilistic programming languages (PPLs), i.e., we present BART as a primitive that can be used to build probabilistic models rather than as a standalone model. Specifically, we introduce a BART implementation extending PyMC, a Python library for probabilistic programming. We present a few examples of models that can be built using this probabilistic-programming-oriented version of BART, discuss recommendations for sampling diagnostics and the selection of model hyperparameters, and finally close with the limitations of the current approach and possible future extensions.
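To build intuition for the sum-of-trees idea described above, the following is a minimal illustrative sketch (not BART itself): it greedily fits a noisy 1-D function as a shrunken sum of many weak single-split trees ("stumps"), so that no individual stump explains the data, only the ensemble does. BART replaces this greedy, frequentist-style fit with MCMC over tree structures and shrinkage priors, but the additive structure is analogous. All names here (`fit_stump`, `shrink`, the test function) are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 2 * np.pi, 200))
y = np.sin(x) + rng.normal(0, 0.1, x.size)


def fit_stump(x, r):
    """Best single-split tree (stump) for residual r, by squared error."""
    best = None
    for s in np.quantile(x, np.linspace(0.05, 0.95, 19)):
        pred = np.where(x <= s, r[x <= s].mean(), r[x > s].mean())
        sse = ((r - pred) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, s, r[x <= s].mean(), r[x > s].mean())
    return best[1:]


pred = np.zeros_like(y)
shrink = 0.2          # analog of the prior shrinking each tree's contribution
for _ in range(50):   # sum of 50 weak trees
    s, lo, hi = fit_stump(x, y - pred)
    pred += shrink * np.where(x <= s, lo, hi)

mse = np.mean((y - pred) ** 2)
```

Each stump alone is a very poor approximation of `sin(x)`, yet the shrunken sum tracks the function closely; this is the regularization-by-weakness that BART's priors enforce probabilistically.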