Large observational datasets are increasingly available in disciplines such as the health, economic, and social sciences, where researchers are interested in causal questions rather than prediction. In this paper, we examine the problem of estimating heterogeneous treatment effects using non-parametric regression-based methods, motivated by an empirical study of the effect of participation in school meal programs on health indicators. First, we introduce the setup and the issues that arise when conducting causal inference with observational or not fully randomized data, and show how these issues can be tackled with statistical learning tools. Then, we review the existing state-of-the-art frameworks that allow for individual treatment effect estimation via non-parametric regression models and develop a unifying taxonomy of them. After a brief overview of the problem of model selection, we illustrate the performance of some of the methods in three different simulation studies. We conclude by demonstrating the use of some of the methods in an empirical analysis of the school meal program data.