We attempt to give a unifying view of the various recent attempts to (i) improve the interpretability of tree-based models and (ii) debias the default variable-importance measure in random forests, Gini importance. In particular, we demonstrate a common thread among the out-of-bag based bias correction methods and their connection to local explanations for trees. In addition, we point out a bias caused by the inclusion of in-bag data in the newly developed explainable AI for trees algorithms.
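To make the bias concrete, the following is a minimal sketch (not the paper's method): impurity-based Gini importance assigns nonzero credit to a pure-noise continuous feature, while a permutation-based alternative, in the same spirit as the out-of-bag corrections discussed here, largely does not. The feature construction and scikit-learn usage are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 1000
signal = rng.integers(0, 2, n)        # binary feature that carries the label
noise = rng.normal(size=n)            # continuous feature, pure noise
X = np.column_stack([signal, noise])
y = signal ^ (rng.random(n) < 0.1)    # label = signal with 10% of entries flipped

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Gini (mean decrease in impurity) importance: the noise feature
# receives a noticeably nonzero score despite carrying no information.
print("Gini importance:       ", rf.feature_importances_)

# Permutation importance: the noise feature scores near zero.
perm = permutation_importance(rf, X, y, n_repeats=10, random_state=0)
print("Permutation importance:", perm.importances_mean)
```

The inflation arises because the continuous noise feature offers many candidate split points, so trees can overfit it in-bag; evaluating importance on data the split did not see removes that advantage.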