This study shows that there is a testable condition for identifying the causal effect of a treatment on an outcome in observational data, which relies on two sets of variables: observed covariates to be controlled for and a suspected instrument. Under a causal structure commonly found in empirical applications, the testable conditional independence of the suspected instrument and the outcome given the treatment and the covariates has two implications. First, the instrument is valid, i.e., it does not directly affect the outcome (other than through the treatment) and is unconfounded conditional on the covariates. Second, the treatment is unconfounded conditional on the covariates, such that the treatment effect is identified. We suggest tests of this conditional independence based on machine learning methods that account for covariates in a data-driven way, and we investigate their asymptotic behavior and finite sample performance in a simulation study. We also apply our testing approach to evaluating the impact of fertility on female labor supply when using the sibling sex ratio of the first two children as the supposed instrument, which by and large points to a violation of our testable implication for the moderate set of socio-economic covariates considered.
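The logic of the testable implication can be illustrated with a small simulation. The sketch below is not the machine-learning-based test proposed in the study; it swaps in a plain linear partial correlation as a stand-in, and the data-generating process, the coefficient values, and the names `partial_corr`, `simulate`, and `ci_statistic` are all hypothetical choices made for illustration. When an unobserved variable `u` confounds the treatment, conditioning on the treatment opens a dependence between the instrument and the outcome, so the conditional independence fails.

```python
import numpy as np


def partial_corr(z, y, controls):
    """Correlation of z and y after linearly residualizing both on controls.

    Assumption: a linear stand-in for the data-driven covariate adjustment
    described in the abstract, used only to keep the sketch short.
    """
    C = np.column_stack([np.ones(len(z)), controls])
    rz = z - C @ np.linalg.lstsq(C, z, rcond=None)[0]
    ry = y - C @ np.linalg.lstsq(C, y, rcond=None)[0]
    return float(np.corrcoef(rz, ry)[0, 1])


def simulate(n, confounding, seed=0):
    """Hypothetical linear design: x covariate, u unobserved, z instrument."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    u = rng.normal(size=n)  # unobserved; only matters if confounding != 0
    z = 0.5 * x + rng.normal(size=n)  # instrument depends on x only
    d = z + x + confounding * u + rng.normal(size=n)  # treatment
    y = d + x + confounding * u + rng.normal(size=n)  # outcome, no direct z effect
    return z, d, y, x


def ci_statistic(n, confounding, seed=0):
    """Partial correlation of instrument and outcome given treatment and covariate."""
    z, d, y, x = simulate(n, confounding, seed)
    return partial_corr(z, y, np.column_stack([d, x]))


if __name__ == "__main__":
    print("no confounding:  ", ci_statistic(20_000, confounding=0.0))
    print("with confounding:", ci_statistic(20_000, confounding=1.0))
```

In this design the statistic should be close to zero without confounding and clearly negative (around -0.3) with confounding, which mirrors the idea that a rejection of the conditional independence signals a violation of instrument validity, treatment unconfoundedness, or both.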