Consider a setting where there are $N$ heterogeneous units (e.g., individuals, sub-populations) and $D$ interventions (e.g., socio-economic policies). Our goal is to learn the potential outcome associated with every intervention on every unit (i.e., $N \times D$ causal parameters). To this end, we present a causal framework, synthetic interventions (SI), to infer these $N \times D$ causal parameters while observing each of the $N$ units under at most two interventions, independent of $D$. This data efficiency becomes more valuable as the number of interventions, i.e., the level of personalization, grows. Importantly, our estimator allows for latent confounders that determine how interventions are assigned. Theoretically, under a novel tensor factor model across units, measurements, and interventions, we establish an identification result for each of these $N \times D$ causal parameters, and we establish finite-sample consistency and asymptotic normality of our estimator. The estimator is furnished with a data-driven test to verify its suitability. Empirically, we validate our framework through both experimental and observational case studies: a large-scale A/B test performed on an e-commerce platform, and an evaluation of mobility restrictions on morbidity outcomes due to COVID-19. We believe this has important implications for program evaluation and the design of data-efficient RCTs with heterogeneous units and multiple interventions.
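To make the data requirement concrete, the following is a minimal sketch of how counterfactuals might be recovered when each unit is seen under at most two interventions. It is not the estimator as specified by the authors: it assumes a noiseless low-rank factor model, uses a principal-component-regression step as one plausible way to learn unit weights, and the function name `si_estimate`, the simulated dimensions, and the assignment rule are all hypothetical choices made for illustration.

```python
import numpy as np

def si_estimate(pre_target, pre_donors, post_donors, rank):
    """Illustrative sketch (assumed, not taken from the abstract).

    pre_target : (T0,)          target unit's outcomes under the common condition
    pre_donors : (T0, n_donors) donor units' outcomes under that same condition
    post_donors: (T1, n_donors) donor units' outcomes under intervention d
    rank       : number of singular components to keep
    """
    # Truncated SVD of the donor matrix (principal component regression).
    U, s, Vt = np.linalg.svd(pre_donors, full_matrices=False)
    pinv_k = Vt[:rank].T @ np.diag(1.0 / s[:rank]) @ U[:, :rank].T
    # Weights expressing the target unit as a combination of donor units.
    w = pinv_k @ pre_target
    # Carry the same weights over to the donors' outcomes under intervention d.
    return post_donors @ w

# Simulated data under an assumed rank-r factor structure (no noise).
rng = np.random.default_rng(0)
N, D, T0, T1, r = 40, 3, 30, 10, 3
V = rng.normal(size=(N, r))      # latent unit factors
A = rng.normal(size=(T0, r))     # measurement factors, common condition
B = rng.normal(size=(D, T1, r))  # measurement factors, one set per intervention

Y_pre = A @ V.T                  # every unit observed under the common condition
assignment = np.arange(N) % D    # each unit then receives exactly one intervention

target, d = 0, 2                 # unit 0 never receives intervention 2
donors = np.flatnonzero((assignment == d) & (np.arange(N) != target))
post_donors = B[d] @ V[donors].T # donors' observed outcomes under intervention d

est = si_estimate(Y_pre[:, target], Y_pre[:, donors], post_donors, rank=r)
truth = B[d] @ V[target]         # unobserved counterfactual, known only in simulation
print("max abs error:", np.max(np.abs(est - truth)))
```

In this idealized setup every unit contributes data under only two conditions, yet the counterfactual for any (unit, intervention) pair can be formed from the units that did receive that intervention. With noise, the choice of `rank` and the number of donors would matter considerably more than they do here.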