Conventional standard errors reflect the fact that the observed data is sampled from an infinite super-population, but this approach to uncertainty may be unnatural in settings where all units in the population are observed (e.g. all 50 U.S. states). In such settings, it may be more natural to view the uncertainty as design-based, i.e. arising from the stochastic assignment of treatment. This paper develops a design-based framework for uncertainty that is suitable for analyzing "quasi-experimental" settings commonly studied in economics. A key feature of our framework is that each unit has an idiosyncratic probability of receiving treatment, but these idiosyncratic probabilities are unknown to the researcher. We derive conditions under which difference-in-differences (DiD) and related estimators are unbiased for an interpretable causal estimand. When the DiD estimator is unbiased, conventional confidence intervals are valid but potentially conservative in large populations. An interesting feature of our setting is that conventional standard errors tend to be more conservative when treatment probabilities differ across units, which helps to mitigate undercoverage from bias. As a result, conventional confidence intervals for DiD can potentially still have correct coverage even if the design-based analog to parallel trends does not hold exactly. Our results also have implications for the appropriate level to cluster standard errors and for the analysis of instrumental variables.
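As an illustration of the design-based view described above, the following is a minimal simulation sketch, not the paper's own procedure. It holds a finite population fixed and re-draws only the treatment assignment, then compares the variance of the DiD estimator across assignments with the average conventional (heteroskedasticity-robust) variance estimate. All function names, the population size, and the distributions of trends, effects, and treatment probabilities are assumptions made for illustration. For simplicity, treatment is drawn as independent Bernoulli trials with unit-specific probabilities and redrawn until both groups contain at least two units. This is a simplification and may differ from the assignment mechanism the paper analyzes.

```python
import random
import statistics


def did_estimate(delta_y, treated):
    """DiD: mean outcome change among treated minus mean change among controls."""
    t = [d for d, w in zip(delta_y, treated) if w]
    c = [d for d, w in zip(delta_y, treated) if not w]
    return statistics.fmean(t) - statistics.fmean(c)


def conventional_variance(delta_y, treated):
    """Conventional robust variance estimate: s1^2 / N1 + s0^2 / N0."""
    t = [d for d, w in zip(delta_y, treated) if w]
    c = [d for d, w in zip(delta_y, treated) if not w]
    return statistics.variance(t) / len(t) + statistics.variance(c) / len(c)


def draw_assignment(probs, rng):
    """Independent Bernoulli draws, redrawn until each group has >= 2 units.

    Assumption for illustration only; the redraw keeps both sample
    variances well defined.
    """
    while True:
        treated = [rng.random() < p for p in probs]
        n_treated = sum(treated)
        if 2 <= n_treated <= len(probs) - 2:
            return treated


def simulate(n_units=50, n_draws=2000, seed=0):
    """Hold the population fixed; only the treatment assignment is random."""
    rng = random.Random(seed)
    # Hypothetical finite population: untreated trends, unit-level effects,
    # and idiosyncratic treatment probabilities (unknown to a researcher).
    trend0 = [rng.gauss(0, 1) for _ in range(n_units)]
    effect = [rng.gauss(1, 1) for _ in range(n_units)]
    probs = [rng.uniform(0.2, 0.8) for _ in range(n_units)]

    estimates, variances = [], []
    for _ in range(n_draws):
        treated = draw_assignment(probs, rng)
        # Observed change in outcome: untreated trend plus effect if treated.
        delta_y = [g + tau * w for g, tau, w in zip(trend0, effect, treated)]
        estimates.append(did_estimate(delta_y, treated))
        variances.append(conventional_variance(delta_y, treated))

    return {
        "mean_estimate": statistics.fmean(estimates),
        "design_variance": statistics.pvariance(estimates),
        "mean_conventional_variance": statistics.fmean(variances),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.4f}")
```

Comparing `design_variance` with `mean_conventional_variance` gives a rough sense of how conservative the conventional variance estimate is in a given hypothetical population. Making the trends in `trend0` depend on `probs` is one way to explore what happens when the design-based analog to parallel trends fails.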