Randomized experiments are widely used to estimate the causal effects of a proposed treatment in many areas of science, from medicine and healthcare to the physical and biological sciences, from the social sciences to engineering, and from public policy to the technology industry at large. Here, we consider situations where classical methods for estimating the total treatment effect on a target population are considerably biased due to confounding network effects, i.e., the fact that the treatment of an individual may impact their neighbors' outcomes, an issue referred to as network interference or as non-individualized treatment response. A key challenge in these situations is that the network is often unknown and difficult or costly to measure. In this paper, we characterize the limitations in estimating the total treatment effect without knowledge of the network that drives interference, assuming a potential outcomes model with heterogeneous additive network effects. This model encompasses a broad class of network interference sources, including spillover, peer effects, and contagion. Within this framework, we show that, surprisingly, given access to average historical baseline measurements taken prior to the experiment, we can develop a simple estimator and an efficient randomized design that together yield an unbiased estimate with low variance. Our solution does not require knowledge of the underlying network structure, and it comes with statistical guarantees for a broad class of models. We believe our results are poised to impact current randomized experimentation strategies due to their ease of interpretation and implementation, alongside their provable theoretical insights under heterogeneous network effects.
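To make the setting concrete, the following is a minimal simulation sketch, not the paper's actual estimator or design. It assumes a hypothetical additive model in which unit i's outcome is `c0[i]` plus a sum of effects `c[i][j]` over treated units j, a Bernoulli(p) assignment, and a noise-free average baseline. Under those assumptions, the difference in means recovers roughly the direct effect only, while rescaling the change from the baseline average by 1/p is unbiased for the total treatment effect without ever reading the network. All names (`simulate`, `c0`, `effects`) and all parameter values are illustrative assumptions.

```python
import random
from statistics import mean


def simulate(n=100, degree=5, p=0.5, reps=1000, seed=0):
    """Compare two estimators of the total treatment effect (TTE).

    Assumed (hypothetical) model: Y_i(z) = c0[i] + sum_j c[i][j] * z[j],
    where c[i][i] is a direct effect and c[i][j], j != i, is a spillover.
    """
    rng = random.Random(seed)
    c0 = [rng.gauss(0, 1) for _ in range(n)]  # baseline outcomes Y_i(0)
    effects = []
    for i in range(n):
        row = {i: rng.uniform(0.5, 1.5)}  # direct effect of own treatment
        for j in rng.sample(range(n), degree):
            if j != i:
                row[j] = rng.uniform(0.0, 0.4)  # spillover from unit j
        effects.append(row)

    # TTE = average of Y_i(all treated) - Y_i(none treated).
    true_tte = mean(sum(row.values()) for row in effects)
    baseline_avg = mean(c0)  # the only pre-experiment quantity used

    dim_estimates, baseline_estimates = [], []
    for _ in range(reps):
        z = [1 if rng.random() < p else 0 for _ in range(n)]
        y = [c0[i] + sum(c * z[j] for j, c in effects[i].items())
             for i in range(n)]
        treated = [y[i] for i in range(n) if z[i]]
        control = [y[i] for i in range(n) if not z[i]]
        # Classical difference in means: ignores interference.
        dim_estimates.append(mean(treated) - mean(control))
        # Baseline-based estimator: uses no network information.
        baseline_estimates.append((mean(y) - baseline_avg) / p)

    return true_tte, mean(dim_estimates), mean(baseline_estimates)


if __name__ == "__main__":
    tte, dim, base = simulate()
    print(f"true TTE={tte:.3f}  diff-in-means={dim:.3f}  baseline-based={base:.3f}")
```

The unbiasedness in this sketch follows from linearity: under the assumed model, the expected average outcome equals the baseline average plus p times the total treatment effect. The sketch says nothing about the variance properties or the specific randomized design claimed in the abstract.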