Tree-based regression and classification have become standard tools in modern data science. Bayesian Additive Regression Trees (BART) has in particular gained wide popularity due to its flexibility in dealing with interactions and non-linear effects. BART is a Bayesian tree-based machine learning method that can be applied to both regression and classification problems and yields competitive or superior results when compared to other predictive models. As a Bayesian model, BART allows the practitioner to explore the uncertainty around predictions through the posterior distribution. In this paper, we present new visualization techniques for exploring BART models. We construct conventional plots to analyze a model's performance and stability, and we create new tree-based plots to analyze variable importance, interactions, and tree structure. We employ Value Suppressing Uncertainty Palettes (VSUP) to construct heatmaps that display variable importance and interactions jointly, using the color scale to represent posterior uncertainty. Our new visualizations are designed to work with the most popular BART R packages available, namely BART, dbarts, and bartMachine. Our approach is implemented in the R package bartMan (BART Model ANalysis).