Instrumental variables (IVs) are widely used to estimate treatment effects when the treatment and outcome are confounded by unmeasured variables; however, weak IVs are common in empirical studies and can cause problems. Many studies have considered building a stronger IV from the original, possibly weak, IV in the design stage of a matched study, at the cost of excluding some of the sample from the analysis. It is widely accepted that strengthening an IV tends to make nonparametric tests more powerful and increases the power of sensitivity analyses in large samples. In this article, we re-evaluate this conventional wisdom and offer new insights. We consider matched observational studies from three perspectives. First, assuming the IV is valid, we evaluate the trade-off between IV strength and sample size for nonparametric tests, and exhibit conditions under which strengthening an IV increases power and, conversely, conditions under which it decreases power. Second, we derive a necessary condition for a valid sensitivity analysis model with continuous doses. We show that the $\Gamma$ sensitivity analysis model, which was previously used to conclude that strengthening an IV increases the power of sensitivity analyses in large samples, does not apply to the continuous IV setting, so that conclusion may be invalid. Third, we quantify the bias of the Wald estimator with a possibly invalid IV under an oracle and leverage it to develop a valid sensitivity analysis framework; under this framework, we show that strengthening an IV may amplify or mitigate the bias of the estimator, and may or may not increase the power of sensitivity analyses. We also discuss how to better adjust for the observed covariates when building an IV in matched studies.
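For readers unfamiliar with the Wald estimator mentioned above, the following is a minimal sketch of its textbook form for a binary instrument: the difference in mean outcomes between the two instrument groups, divided by the difference in mean treatment. The function name `wald_estimate` and the toy data are hypothetical and are not taken from the article. The second call illustrates, under the assumption of a direct effect of the instrument on the outcome, how an invalid IV shifts the estimate by that direct effect divided by the IV strength; this is a standard illustration, not the article's bias formula.

```python
from statistics import mean


def wald_estimate(z, d, y):
    """Textbook Wald estimate for a binary instrument z, treatment d, outcome y.

    Hypothetical helper for illustration only; not the article's method.
    """
    d1 = [di for zi, di in zip(z, d) if zi == 1]
    d0 = [di for zi, di in zip(z, d) if zi == 0]
    y1 = [yi for zi, yi in zip(z, y) if zi == 1]
    y0 = [yi for zi, yi in zip(z, y) if zi == 0]
    # IV strength: how much the instrument moves the treatment on average.
    first_stage = mean(d1) - mean(d0)
    if first_stage == 0:
        raise ValueError("instrument has no effect on treatment")
    # Ratio of the outcome contrast to the treatment contrast.
    return (mean(y1) - mean(y0)) / first_stage


# Toy data (assumed): true treatment effect is 2, IV strength is 0.5.
z = [0, 0, 1, 1]
d = [0, 1, 1, 1]

# Valid IV: z affects y only through d.
y_valid = [2 * di for di in d]
est_valid = wald_estimate(z, d, y_valid)

# Invalid IV (assumed violation): z also shifts y directly by 0.3.
y_invalid = [2 * di + 0.3 * zi for zi, di in zip(z, d)]
est_invalid = wald_estimate(z, d, y_invalid)

print(est_valid, est_invalid)
```

In this toy example the valid-IV estimate recovers the assumed effect of 2, while the invalid-IV estimate is off by the direct effect (0.3) divided by the IV strength (0.5). Because the strength sits in the denominator, changing it changes how much a given violation distorts the estimate.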