We consider the task of estimating the causal effect of a treatment variable on a long-term outcome variable using data from an observational domain and an experimental domain. The observational data are assumed to be confounded; hence, without further assumptions, this dataset alone cannot be used for causal inference. In the experimental data, only a short-term version of the primary outcome variable of interest is observed, so this dataset alone cannot be used for causal inference either. In recent work, Athey et al. (2020) proposed a method for systematically combining such data to identify the long-term causal effect of interest. Their approach is based on the assumptions of internal and external validity of the experimental data, together with an additional novel assumption called latent unconfoundedness. In this paper, we first review their proposed approach and discuss the latent unconfoundedness assumption. We then propose two alternative approaches to data fusion for estimating the average treatment effect as well as the effect of treatment on the treated. Our first proposed approach is based on assuming equi-confounding bias for the short-term and long-term outcomes. Our second proposed approach is based on the proximal causal inference framework, in which we assume the existence of an additional variable in the system that is a proxy of the latent confounder of the treatment-outcome relation.
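To make the equi-confounding idea concrete, the following is a minimal simulation sketch, not the estimator developed in the paper. It assumes a deliberately simple setting that the abstract does not specify: a single latent confounder that shifts the short-term and long-term outcomes by the same additive amount, constant treatment effects, and no covariates. Under those assumptions, the confounding bias in the short-term outcome can be measured by comparing the observational and experimental contrasts, and then subtracted from the naive long-term contrast. All names (`simulate`, `diff_in_means`, `equi_confounding_estimate`, `tau_s`, `tau_y`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def simulate(n=200_000, tau_s=1.0, tau_y=2.0, seed=0):
    """Toy data: tau_s and tau_y are the assumed constant short- and long-term effects."""
    rng = np.random.default_rng(seed)

    # Observational domain: latent U drives treatment uptake and both outcomes.
    u = rng.normal(size=n)
    a_obs = (u + rng.normal(size=n) > 0).astype(int)
    s_obs = tau_s * a_obs + u + rng.normal(size=n)
    y_obs = tau_y * a_obs + u + rng.normal(size=n)  # same additive bias from U (assumption)

    # Experimental domain: treatment is randomized and only S is recorded.
    u_exp = rng.normal(size=n)
    a_exp = rng.integers(0, 2, size=n)
    s_exp = tau_s * a_exp + u_exp + rng.normal(size=n)

    return (a_obs, s_obs, y_obs), (a_exp, s_exp)


def diff_in_means(a, v):
    """Treated-minus-control difference in sample means."""
    return v[a == 1].mean() - v[a == 0].mean()


def equi_confounding_estimate(obs, expt):
    """Subtract the short-term confounding bias from the naive long-term contrast."""
    a_obs, s_obs, y_obs = obs
    a_exp, s_exp = expt
    # Bias in S: confounded observational contrast minus unbiased experimental contrast.
    bias_s = diff_in_means(a_obs, s_obs) - diff_in_means(a_exp, s_exp)
    # Equi-confounding (as assumed here): the bias in Y equals the bias in S.
    return diff_in_means(a_obs, y_obs) - bias_s


obs, expt = simulate()
naive = diff_in_means(obs[0], obs[2])
corrected = equi_confounding_estimate(obs, expt)
print(f"naive long-term contrast:     {naive:.3f}")
print(f"equi-confounding correction:  {corrected:.3f}")
```

In this toy setting the naive observational contrast overstates the long-term effect, while the corrected value should land close to the simulated `tau_y`. The correction relies entirely on the two biases being equal; if the latent confounder affected the two outcomes with different strengths, this simple subtraction would no longer recover the effect.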