This paper studies causal inference with observational data from a single large network. We consider a nonparametric model with interference in both potential outcomes and selection into treatment. Specifically, both stages may be the outcomes of simultaneous equations models, allowing for endogenous peer effects. This results in high-dimensional network confounding where the network and covariates of all units constitute sources of selection bias. In contrast, the existing literature assumes that confounding can be summarized by a known, low-dimensional function of these objects. We propose to use graph neural networks (GNNs) to adjust for network confounding. When interference decays with network distance, we argue that the model has low-dimensional structure that makes estimation feasible and justifies the use of shallow GNN architectures.
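To make the link between decaying interference and shallow architectures concrete, the following is a minimal sketch, not the paper's estimator. It assumes a generic mean-aggregation message-passing layer with a tanh activation, random untrained weights, and a six-unit path graph; the names `gnn_embed` and `path_graph` are hypothetical and introduced only for illustration. The point it shows is structural: an L-layer GNN embedding for a unit depends only on the covariates and links within L hops of that unit, so a shallow GNN adjusts only for confounding arising from nearby units.

```python
import numpy as np


def gnn_embed(adj, x, weights):
    """Apply one mean-aggregation message-passing layer per weight pair.

    adj     : (n, n) symmetric 0/1 adjacency matrix
    x       : (n, d) unit-level covariates
    weights : list of (w_self, w_nbr) pairs, each of shape (d, d)
    """
    # Guard against division by zero for isolated units.
    deg = np.maximum(adj.sum(axis=1, keepdims=True), 1.0)
    h = x
    for w_self, w_nbr in weights:
        nbr_mean = (adj @ h) / deg  # average of neighbors' current embeddings
        h = np.tanh(h @ w_self + nbr_mean @ w_nbr)
    return h


def path_graph(n):
    """Adjacency matrix of the path 0 - 1 - ... - (n - 1)."""
    adj = np.zeros((n, n))
    idx = np.arange(n - 1)
    adj[idx, idx + 1] = 1.0
    adj[idx + 1, idx] = 1.0
    return adj


rng = np.random.default_rng(0)
n, d, n_layers = 6, 3, 2  # illustrative sizes, not taken from the paper
adj = path_graph(n)
x = rng.normal(size=(n, d))
weights = [
    (rng.normal(scale=0.5, size=(d, d)), rng.normal(scale=0.5, size=(d, d)))
    for _ in range(n_layers)
]

base = gnn_embed(adj, x, weights)

# Perturb a unit three hops from unit 0: outside a two-layer receptive field.
x_far = x.copy()
x_far[3] += 1.0
far = gnn_embed(adj, x_far, weights)

# Perturb a unit two hops from unit 0: inside the receptive field.
x_near = x.copy()
x_near[2] += 1.0
near = gnn_embed(adj, x_near, weights)

unchanged_far = np.allclose(base[0], far[0])
changed_near = not np.allclose(base[0], near[0])
print("unit 0 unaffected by 3-hop perturbation:", unchanged_far)
print("unit 0 affected by 2-hop perturbation:", changed_near)
```

In this sketch the number of layers plays the role of the neighborhood radius. If interference beyond that radius is negligible, the covariates and links outside it contribute little to selection bias, and that is the sense in which a shallow architecture can suffice.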