Many major works in social science use matching to draw causal conclusions, but different matches on the same data may produce different treatment effect estimates, even when they achieve similar balance or minimize the same loss function. We discuss the reasons for and consequences of this problem. We present evidence of it by replicating ten papers that use matching, finding that different popular matching algorithms produce inconsistent results. We introduce Matching Bounds: a finite-sample, nonstochastic method that lets analysts determine whether their data could yield a matched sample that has the same balance and overall match quality but produces different results. We apply Matching Bounds to replications of two studies and show that in one case the results are robust to this issue and in the other they are not.