Because the average treatment effect (ATE) measures only the change in aggregate social welfare, even a positive ATE leaves a risk of a negative effect on, say, some 10% of the population. Assessing such risk is difficult, however, because no individual treatment effect (ITE) is ever observed, so the 10% worst-affected cannot be identified, while distributional treatment effects only compare the first deciles within each treatment group, a comparison that does not correspond to any 10% subpopulation. In this paper we consider how to nonetheless assess this important risk measure, formalized as the conditional value at risk (CVaR) of the ITE distribution. We leverage the availability of pre-treatment covariates and characterize the tightest-possible upper and lower bounds on ITE-CVaR given by the covariate-conditional average treatment effect (CATE) function. Some bounds can also be interpreted as summarizing a complex CATE function into a single metric and are of interest in their own right, independently of their role as bounds. We then proceed to study how to estimate these bounds efficiently from data and construct confidence intervals. This is challenging even in randomized experiments, as it requires understanding the distribution of the unknown CATE function, which can be very complex if we use rich covariates so as to best control for heterogeneity. We develop a debiasing method that overcomes this and prove that it enjoys favorable statistical properties even when CATE and other nuisances are estimated by black-box machine learning or even inconsistently. In a study of a hypothetical change to French job-search counseling services, our bounds and inference demonstrate that a small social benefit entails a negative impact on a substantial subpopulation.
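To make the risk measure in the abstract concrete, the following is a minimal sketch, not the paper's estimator or debiasing method. It assumes that the CVaR at level α is read as the average over the worst-affected (lowest) α-fraction of a distribution, and it uses a purely synthetic population in which the ITE is known by construction, something that is never available in real data. The function name `lower_cvar`, the linear CATE, and the Gaussian noise are all illustrative assumptions. Under this reading, since the CATE is the conditional mean of the ITE given covariates, the CVaR of the CATE can be no lower than the CVaR of the ITE, so it serves as an upper bound on ITE-CVaR.

```python
import numpy as np


def lower_cvar(values, alpha):
    """Average of the lowest alpha-fraction of `values` (empirical lower-tail CVaR).

    The sample point that straddles the alpha boundary is given fractional weight,
    so alpha = 1 returns the plain mean.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    v = np.sort(np.asarray(values, dtype=float))
    n = v.size
    mass = alpha * n              # number of sample points covered by the tail
    k = int(np.floor(mass))       # points that fall entirely inside the tail
    total = v[:k].sum()
    if k < n:
        total += (mass - k) * v[k]  # fractional weight on the boundary point
    return total / mass


# Synthetic population (hypothetical): ITE is known here only because we simulate it.
rng = np.random.default_rng(0)
n = 100_000
x = rng.uniform(-1.0, 1.0, size=n)        # a single pre-treatment covariate
cate = 0.5 + x                            # assumed CATE function tau(x) = 0.5 + x
ite = cate + rng.normal(0.0, 1.0, size=n)  # ITE = CATE + mean-zero idiosyncratic part

alpha = 0.10
ate = ite.mean()
cate_cvar = lower_cvar(cate, alpha)  # computable from the CATE function alone
ite_cvar = lower_cvar(ite, alpha)    # not computable in practice; shown for comparison

print(f"ATE: {ate:.3f}")
print(f"CATE-CVaR at {alpha:.0%}: {cate_cvar:.3f}")
print(f"ITE-CVaR at {alpha:.0%}: {ite_cvar:.3f}")
```

In this simulation the ATE is positive while both tail averages are negative, and the ITE tail is worse than the CATE tail. How far apart the two are depends on how much effect heterogeneity the covariates leave unexplained, which is why the abstract speaks of bounds rather than point identification.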