Instrumental variable (IV) strategies are widely used in political science to establish causal relationships. However, the identifying assumptions required by an IV design are demanding, and it remains challenging for researchers to assess their validity. In this paper, we replicate 67 papers published in three top political science journals during 2010-2022 and identify several troubling patterns. First, researchers often overestimate the strength of their IVs due to non-i.i.d. errors, such as a clustering structure. Second, the most commonly used t-test for the two-stage least squares (2SLS) estimates often severely underestimates uncertainty. Using more robust inferential methods, we find that around 19-30% of the 2SLS estimates in our sample are underpowered. Third, in the majority of the replicated studies, the 2SLS estimates are much larger than the ordinary least squares (OLS) estimates. In studies where the IVs are not experimentally generated, the ratio of the 2SLS to the OLS estimates is negatively correlated with the strength of the IVs, suggesting potential violations of unconfoundedness or the exclusion restriction. To help researchers avoid these pitfalls, we provide a checklist for better practice.
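The first pattern, overstated instrument strength under clustered errors, can be illustrated with a small simulation. The sketch below is not the authors' procedure. It assumes simulated data, a single instrument that varies only at the cluster level, and a basic cluster-robust (CR0) sandwich variance with no small-sample correction. The function name `first_stage_f` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def first_stage_f(d, z, clusters):
    """Return (classical F, cluster-robust F) for a single instrument z.

    d: treatment variable, z: instrument, clusters: cluster labels.
    With one instrument, the first-stage F equals the squared t-statistic
    on the instrument's coefficient.
    """
    n = len(d)
    X = np.column_stack([np.ones(n), z])   # intercept + instrument
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ d               # first-stage OLS coefficients
    resid = d - X @ beta

    # Classical variance sigma^2 (X'X)^{-1}: assumes i.i.d. errors.
    sigma2 = resid @ resid / (n - X.shape[1])
    v_classical = sigma2 * XtX_inv

    # Cluster-robust (CR0) sandwich: sum the score outer products by cluster.
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(clusters):
        idx = clusters == g
        score = X[idx].T @ resid[idx]
        meat += np.outer(score, score)
    v_cluster = XtX_inv @ meat @ XtX_inv

    f_classical = beta[1] ** 2 / v_classical[1, 1]
    f_cluster = beta[1] ** 2 / v_cluster[1, 1]
    return f_classical, f_cluster


# Simulated data (assumed values): 50 clusters of 40 units each.
rng = np.random.default_rng(0)
n_clusters, cluster_size = 50, 40
clusters = np.repeat(np.arange(n_clusters), cluster_size)

# The instrument and part of the first-stage error are shared within clusters.
z = np.repeat(rng.normal(size=n_clusters), cluster_size)
cluster_shock = np.repeat(rng.normal(size=n_clusters), cluster_size)
v = cluster_shock + rng.normal(size=n_clusters * cluster_size)
d = 0.15 * z + v

f_classical, f_cluster = first_stage_f(d, z, clusters)
print(f"classical F: {f_classical:.2f}")
print(f"cluster-robust F: {f_cluster:.2f}")
```

Because both the instrument and a large share of the error are constant within clusters in this setup, the classical formula treats correlated observations as independent and reports a much larger F than the cluster-robust version. A researcher relying on the classical statistic would therefore judge the same instrument to be considerably stronger than the clustered data support.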