Randomized controlled trials (RCTs) are considered the gold standard for estimating the effects of interventions. Recent work has studied effect heterogeneity in RCTs by conditioning estimates on tabular variables such as age and ethnicity. However, such variables are often observed only near the time of the experiment and may fail to capture historical or geographical reasons for effect variation. When experimental units are associated with a particular location, satellite imagery can provide such historical and geographical information, yet there is no method that incorporates it for describing effect heterogeneity. In this paper, we develop such a method, which uses a deep probabilistic modeling framework to estimate clusters of images that share the same distribution over treatment effects. We compare the proposed methods against alternatives in simulation and in an application to estimating the effects of an anti-poverty intervention in Uganda. We introduce a causal regularization penalty to ensure that the cluster model reliably recovers Average Treatment Effects (ATEs). Finally, we discuss feasibility, limitations, and the applicability of these methods to other domains, such as medicine and climate science, where image information is prevalent. We make code for all modeling strategies publicly available in an open-source software package.