Causal inference with panel data is a central econometric problem. A fundamental version of it is the following: let $M^*$ be a low-rank matrix and $E$ be a zero-mean noise matrix. For a `treatment' matrix $Z$ with entries in $\{0,1\}$, we observe the matrix $O$ with entries $O_{ij} := M^*_{ij} + E_{ij} + \mathcal{T}_{ij} Z_{ij}$, where the $\mathcal{T}_{ij}$ are unknown, heterogeneous treatment effects. The goal is to estimate the average treatment effect $\tau^* := \sum_{ij} \mathcal{T}_{ij} Z_{ij} / \sum_{ij} Z_{ij}$. The synthetic control paradigm provides an approach to estimating $\tau^*$ when $Z$ is supported on a single row. This paper extends that framework to allow rate-optimal recovery of $\tau^*$ for general $Z$, thus broadly expanding its applicability. Our guarantees are the first of their type in this general setting. Computational experiments on synthetic and real-world data show a substantial advantage over competing estimators.
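To make the observation model concrete, the following is a minimal Python sketch that simulates $O = M^* + E + \mathcal{T} \circ Z$ and computes $\tau^*$ from the simulated ground truth. It illustrates the problem setup only and is not the estimator proposed in the paper. The function names (`simulate_panel`, `average_treatment_effect`), the Gaussian factor model for $M^*$, the i.i.d. Bernoulli treatment pattern, and all default parameter values are assumptions chosen for illustration.

```python
import numpy as np


def simulate_panel(n=50, m=40, rank=3, noise_sd=0.1, treat_prob=0.2, seed=0):
    """Simulate one panel from the observation model (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)

    # Low-rank baseline M* = U V^T (assumed Gaussian factors).
    U = rng.normal(size=(n, rank))
    V = rng.normal(size=(m, rank))
    M_star = U @ V.T

    # Zero-mean noise matrix E.
    E = rng.normal(scale=noise_sd, size=(n, m))

    # General binary treatment pattern Z (assumed i.i.d. Bernoulli here).
    Z = (rng.random((n, m)) < treat_prob).astype(int)

    # Heterogeneous treatment effects T_ij (assumed Gaussian around 1.0).
    T = 1.0 + 0.5 * rng.normal(size=(n, m))

    # Observed matrix: O_ij = M*_ij + E_ij + T_ij * Z_ij.
    O = M_star + E + T * Z
    return O, M_star, E, Z, T


def average_treatment_effect(T, Z):
    """tau* = sum_ij T_ij Z_ij / sum_ij Z_ij, the average effect over treated entries."""
    n_treated = Z.sum()
    if n_treated == 0:
        raise ValueError("Z has no treated entries, so tau* is undefined.")
    return (T * Z).sum() / n_treated


O, M_star, E, Z, T = simulate_panel()
tau_star = average_treatment_effect(T, Z)
print(f"treated entries: {Z.sum()}, tau* = {tau_star:.4f}")
```

In practice only $O$ and $Z$ are observed; $M^*$, $E$, and $\mathcal{T}$ are available here solely because the data are simulated, which is what allows $\tau^*$ to be computed exactly as a benchmark for an estimator.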