In causal inference, statisticians show growing interest in estimating and analyzing heterogeneity in causal effects in observational studies. However, developing a treatment-effect estimator usually involves a trade-off between accuracy and interpretability, especially when a large number of features enter the estimation. To address this issue, we propose a score-based framework for estimating the Conditional Average Treatment Effect (CATE) function. The framework integrates two components: (i) a matching algorithm that jointly uses propensity and prognostic scores to obtain a proxy of the heterogeneous treatment effect for each observation, and (ii) non-parametric regression trees that construct an estimator of the CATE function conditional on the two scores. The method naturally stratifies treatment effects into subgroups over a 2D grid whose axes are the propensity and prognostic scores. We conduct benchmark experiments on multiple simulated datasets and demonstrate clear advantages of the proposed estimator over state-of-the-art methods. We also evaluate empirical performance in real-life settings, using two observational datasets from a clinical trial and a complex social survey, and interpret policy implications following the numerical results.
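The two-step pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a logistic model for the propensity score, a linear model fitted on control units for the prognostic score, k-nearest-neighbour matching in the standardized score space, and a small hand-rolled regression tree. All function names (`fit_logistic`, `fit_linear`, `matched_effects`, `fit_tree`, `predict_tree`) and the simulated data are hypothetical.

```python
import numpy as np

def fit_logistic(X, t, steps=500, lr=0.5):
    """Assumed propensity model: logistic regression fitted by gradient descent."""
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - t) / len(t)
    return lambda Z: 1.0 / (1.0 + np.exp(-np.column_stack([np.ones(len(Z)), Z]) @ w))

def fit_linear(X, y):
    """Assumed prognostic model: ordinary least squares."""
    Xb = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ beta

def matched_effects(scores, t, y, k=1):
    """Proxy effect per unit: treated-minus-control outcome, using the k nearest
    units from the opposite arm in standardized (propensity, prognostic) space."""
    S = (scores - scores.mean(0)) / scores.std(0)
    tau = np.empty(len(y))
    for arm in (0, 1):
        own, other = np.where(t == arm)[0], np.where(t != arm)[0]
        d = ((S[own][:, None, :] - S[other][None, :, :]) ** 2).sum(-1)
        nn = other[np.argsort(d, axis=1)[:, :k]]
        y_match = y[nn].mean(axis=1)
        tau[own] = (y[own] - y_match) if arm == 1 else (y_match - y[own])
    return tau

def fit_tree(S, tau, depth=2, min_leaf=20):
    """Tiny CART-style regression tree on the two scores (nested dicts)."""
    if depth == 0 or len(tau) < 2 * min_leaf:
        return {"value": float(tau.mean())}
    best = None
    for j in range(S.shape[1]):
        for c in np.quantile(S[:, j], np.linspace(0.1, 0.9, 9)):
            left = S[:, j] <= c
            if left.sum() < min_leaf or (~left).sum() < min_leaf:
                continue
            # Within-node sum of squared errors after the candidate split.
            sse = tau[left].var() * left.sum() + tau[~left].var() * (~left).sum()
            if best is None or sse < best[0]:
                best = (sse, j, c, left)
    if best is None:
        return {"value": float(tau.mean())}
    _, j, c, left = best
    return {"feature": j, "cut": float(c),
            "left": fit_tree(S[left], tau[left], depth - 1, min_leaf),
            "right": fit_tree(S[~left], tau[~left], depth - 1, min_leaf)}

def predict_tree(node, s):
    while "value" not in node:
        node = node["left"] if s[node["feature"]] <= node["cut"] else node["right"]
    return node["value"]

# Simulated observational data (illustrative only).
rng = np.random.default_rng(0)
n, p = 2000, 5
X = rng.normal(size=(n, p))
t = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.5 * X[:, 0])))
tau_true = np.where(X[:, 1] > 0, 2.0, 0.0)  # effect differs by subgroup
y = X[:, 1] + X[:, 2] + tau_true * t + rng.normal(scale=0.5, size=n)

# Step (i): estimate the two scores and match on them to get proxy effects.
ps = fit_logistic(X, t)(X)
prog = fit_linear(X[t == 0], y[t == 0])(X)  # prognostic score from controls only
scores = np.column_stack([ps, prog])
tau_proxy = matched_effects(scores, t, y, k=3)

# Step (ii): regress the proxy effects on the two scores with a tree.
tree = fit_tree(scores, tau_proxy, depth=2)
cate_hat = np.array([predict_tree(tree, s) for s in scores])
```

Each leaf of `tree` corresponds to a rectangle in the (propensity, prognostic) plane, which is what makes the resulting subgroups easy to read off; in practice an established tree implementation would likely replace the hand-rolled one.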