Randomized experiments are widely used to estimate the causal effects of proposed "treatments" in domains spanning the physical sciences, social sciences, medicine, and the technology industry. However, classical approaches to experimental design rely on critical independence assumptions that are violated when the outcome of one individual may be affected by the treatment of another, a phenomenon referred to as network interference. Under such interference, naively applying popular estimators and randomized experimental designs can result in significant bias and loss of efficiency. We consider a heterogeneous linear outcomes model that can capture network interference arising from spillover, peer effects, and contagion. Under this model, we characterize the limitations and possibilities for estimating the total treatment effect, the average direct treatment effect, and the average interference effect. Given access to average historical baseline measurements taken prior to the experiment, we propose simple estimators and randomized designs that yield unbiased, low-variance estimates of these three estimands. Furthermore, our solution and statistical guarantees do not require knowledge of the underlying network structure, and thus can be used in scenarios where the network is unknown and complex. We believe our results are poised to impact current randomized experimentation strategies due to their ease of interpretation and implementation, alongside their provable statistical guarantees under heterogeneous network effects.
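To make the bias claim concrete, the following is a minimal simulation sketch, not the authors' implementation. It assumes one particular form of a heterogeneous linear outcomes model, Y_i(z) = c_i0 + sum_j c_ij z_j, with Bernoulli(p) treatment assignment on a randomly generated sparse network. The function name `compare_estimators`, all coefficient distributions, and the baseline-adjusted estimator (mean(Y) - mean(baseline)) / p are illustrative assumptions. That estimator is simply a natural unbiased choice under this assumed model when the average baseline is known, and it never reads the network.

```python
import numpy as np


def compare_estimators(n=200, p=0.5, n_trials=4000, seed=0):
    """Compare two estimators of the total treatment effect (TTE).

    Assumed model (illustrative): Y_i(z) = c_i0 + sum_j c_ij * z_j.
    Returns (true TTE, average difference-in-means estimate,
    average baseline-adjusted estimate) over n_trials randomizations.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical sparse directed network: adj[i, j] means j's treatment affects i.
    adj = rng.random((n, n)) < 5.0 / n
    np.fill_diagonal(adj, False)

    baseline = rng.normal(1.0, 0.5, size=n)            # c_i0, outcome under no treatment
    direct = rng.uniform(0.5, 1.5, size=n)             # c_ii, heterogeneous direct effects
    spill = adj * rng.uniform(0.0, 0.5, size=(n, n))   # c_ij, heterogeneous spillovers
    effects = spill + np.diag(direct)

    # TTE: average outcome with everyone treated minus average with no one treated.
    tte = (effects @ np.ones(n)).mean()

    # Bernoulli(p) design, one row of assignments per simulated experiment.
    Z = (rng.random((n_trials, n)) < p).astype(float)
    Y = baseline + Z @ effects.T                        # Y[t, i] = c_i0 + sum_j c_ij Z[t, j]

    # Difference in means ignores interference. With n=200 and p=0.5 an
    # empty treatment or control group is vanishingly unlikely.
    treated_mean = (Y * Z).sum(axis=1) / Z.sum(axis=1)
    control_mean = (Y * (1 - Z)).sum(axis=1) / (1 - Z).sum(axis=1)
    diff_in_means = treated_mean - control_mean

    # Baseline-adjusted estimator: uses only the average historical baseline,
    # not the network. Under the assumed model, E[mean(Y)] = mean(c_i0) + p * TTE.
    baseline_adjusted = (Y.mean(axis=1) - baseline.mean()) / p

    return tte, diff_in_means.mean(), baseline_adjusted.mean()


if __name__ == "__main__":
    tte, dim_avg, adj_avg = compare_estimators()
    print(f"true TTE:                  {tte:.3f}")
    print(f"difference in means (avg): {dim_avg:.3f}")
    print(f"baseline-adjusted (avg):   {adj_avg:.3f}")
```

In this sketch the difference-in-means average settles near the mean direct effect and so understates the total treatment effect, while the baseline-adjusted average tracks it closely. The size of the gap depends entirely on the assumed spillover coefficients.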