Causal inference with panel data is a central problem in econometrics. A fundamental version of the problem is as follows: let $M^*$ be a low-rank matrix and $E$ a zero-mean noise matrix. For a `treatment' matrix $Z$ with entries in $\{0,1\}$, we observe the matrix $O$ with entries $O_{ij} := M^*_{ij} + E_{ij} + \mathcal{T}_{ij} Z_{ij}$, where the $\mathcal{T}_{ij}$ are unknown, heterogeneous treatment effects. The goal is to estimate the average treatment effect $\tau^* := \sum_{ij} \mathcal{T}_{ij} Z_{ij} / \sum_{ij} Z_{ij}$. The synthetic control paradigm provides an approach to estimating $\tau^*$ when the support of $Z$ is confined to a single row. This paper extends that framework to allow rate-optimal recovery of $\tau^*$ for general $Z$, thus broadly expanding its applicability. Our guarantees are the first of their type in this general setting. Computational experiments on synthetic and real-world data show a substantial advantage over competing estimators.
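
To make the observation model concrete, the following is a minimal NumPy sketch that simulates $O = M^* + E + \mathcal{T} \circ Z$ and evaluates the target quantity $\tau^*$ directly from its definition. It is not the estimator proposed in the paper: it only illustrates the data-generating setup. The function names (`simulate_panel`, `average_treatment_effect`), the dimensions, the rank, the Gaussian noise, the i.i.d. Bernoulli treatment pattern, and the distribution of the treatment effects are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_panel(n=50, m=40, rank=3, noise_sd=0.1, treat_prob=0.2, seed=0):
    """Simulate one panel from the model O = M* + E + T * Z (entrywise product).

    All distributional choices here are illustrative assumptions, not taken
    from the paper.
    """
    rng = np.random.default_rng(seed)

    # Low-rank baseline M* = U V^T, with rank at most `rank`.
    U = rng.normal(size=(n, rank))
    V = rng.normal(size=(m, rank))
    M_star = U @ V.T

    # Zero-mean noise matrix E.
    E = rng.normal(loc=0.0, scale=noise_sd, size=(n, m))

    # Binary treatment matrix Z with a general (not single-row) pattern.
    Z = (rng.random((n, m)) < treat_prob).astype(int)

    # Heterogeneous treatment effects T_ij, unknown to the analyst.
    T = rng.normal(loc=1.0, scale=0.5, size=(n, m))

    # Observed matrix.
    O = M_star + E + T * Z
    return O, Z, M_star, E, T


def average_treatment_effect(T, Z):
    """Compute tau* = sum_ij T_ij Z_ij / sum_ij Z_ij."""
    n_treated = Z.sum()
    if n_treated == 0:
        raise ValueError("Z has no treated entries, so tau* is undefined.")
    return (T * Z).sum() / n_treated
```

In a simulation, `average_treatment_effect(T, Z)` gives the ground-truth value against which any estimator that sees only `O` and `Z` could be compared; in real data $\mathcal{T}$ is unobserved, which is what makes the estimation problem non-trivial.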