We present new insights into causal inference in the context of Heterogeneous Treatment Effects by proposing natural variants of Random Forests to estimate the key conditional distributions. To achieve this, we recast Breiman's original splitting criterion in terms of Wasserstein distances between empirical measures. This reformulation indicates that Random Forests are well adapted to estimating conditional distributions and provides a natural extension of the algorithm to multivariate outputs. Following the philosophy of Breiman's construction, we propose variants of the splitting rule that are well suited to the conditional distribution estimation problem. We establish preliminary theoretical connections and report numerical experiments that show how our approach may help to conduct more transparent causal inference in complex situations.
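To make the reformulation concrete, the following minimal sketch (an illustration under stated assumptions, not the authors' implementation) checks a standard identity behind it. For a split of N samples into a left cell of size N_L and a right cell of size N_R, Breiman's variance-reduction criterion equals (N_L N_R / N^2) times the squared difference of the two cell means. That squared difference is also the squared 2-Wasserstein distance between Dirac masses placed at the two means. The function names and the `wasserstein_split_score` variant, which compares the full empirical measures of the two cells with the 1-Wasserstein distance, are hypothetical choices made here for illustration.

```python
import numpy as np
from scipy.stats import wasserstein_distance


def variance_reduction(y_left, y_right):
    """Breiman's regression criterion: decrease in empirical variance."""
    y = np.concatenate([y_left, y_right])
    n, n_l, n_r = len(y), len(y_left), len(y_right)
    # np.var uses the population (ddof=0) variance, matching the criterion.
    return np.var(y) - (n_l / n) * np.var(y_left) - (n_r / n) * np.var(y_right)


def mean_gap_form(y_left, y_right):
    """Same criterion written as a weighted squared gap between cell means.

    The squared gap is the squared W2 distance between Dirac masses
    located at the two cell means.
    """
    n_l, n_r = len(y_left), len(y_right)
    n = n_l + n_r
    return (n_l * n_r / n**2) * (np.mean(y_left) - np.mean(y_right)) ** 2


def wasserstein_split_score(y_left, y_right):
    """Hypothetical variant (an assumption, not taken from the source):
    replace the gap between means by the W1 distance between the
    empirical measures of the two cells (univariate outputs only).
    """
    n_l, n_r = len(y_left), len(y_right)
    n = n_l + n_r
    return (n_l * n_r / n**2) * wasserstein_distance(y_left, y_right)
```

For example, with `y_left = np.array([-1.0, 1.0])` and `y_right = np.array([-3.0, 3.0])`, both cells have mean zero, so the mean-based criterion is 0 and cannot distinguish them, whereas the distribution-based score is 0.5. This is the sense in which a Wasserstein-based rule can detect differences in the conditional distribution beyond the conditional mean.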