Selection of invalid instruments can improve estimation in Mendelian randomization

Mendelian randomization (MR) is a widely used method for identifying causal links between a risk factor and a disease. A fundamental part of any MR analysis is choosing appropriate genetic variants as instrumental variables. Current practice usually involves selecting only those genetic variants that are deemed to satisfy certain exclusion restrictions, in a bid to remove bias from unobserved confounding. Many more genetic variants may violate these exclusion restrictions due to unknown pleiotropic effects (i.e. direct effects on the outcome not via the exposure), but their inclusion could increase the precision of causal effect estimates at the cost of allowing some bias. We explore how to optimally tackle this bias-variance trade-off by carefully choosing from many weak and locally invalid instruments. Specifically, we study a focused instrument selection approach for publicly available two-sample summary data on genetic associations, whereby genetic variants are selected on the basis of how they impact the asymptotic mean squared error of causal effect estimates. We show that different restrictions on the nature of pleiotropic effects have important implications for the quality of post-selection inferences. In particular, a focused selection approach under systematic pleiotropy allows for consistent model selection, but in practice can be susceptible to winner's curse biases. By contrast, a more general form of idiosyncratic pleiotropy allows only conservative model selection, but offers uniformly valid confidence intervals. We propose a novel method to tighten honest confidence intervals through support restrictions on pleiotropy. We apply our results to several real data examples, which suggest that the optimal selection of instruments involves not only biologically justified valid instruments but also hundreds of potentially pleiotropic variants.
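The bias-variance trade-off described above can be illustrated with a toy calculation. The sketch below is not the focused selection criterion studied in the paper. It assumes a simple inverse-variance weighted (IVW) estimator on two-sample summary data, treats the variant-exposure associations as fixed, and pretends the pleiotropic effects are known, which they are not in practice. The function name `ivw_bias_variance` and all numbers are hypothetical.

```python
def ivw_bias_variance(bx, se_y, alpha, subset):
    """Approximate bias, variance and MSE of an IVW estimate.

    bx     : variant-exposure associations (treated as fixed, an assumption)
    se_y   : standard errors of the variant-outcome associations
    alpha  : direct (pleiotropic) effects on the outcome, assumed known here
    subset : indices of the variants used as instruments
    """
    # Total IVW weight: sum of bx_j^2 / se_y_j^2 over the chosen variants.
    weight = sum(bx[j] ** 2 / se_y[j] ** 2 for j in subset)
    # Pleiotropy shifts the estimate by a weighted average of alpha_j / bx_j.
    bias = sum(bx[j] * alpha[j] / se_y[j] ** 2 for j in subset) / weight
    variance = 1.0 / weight
    return bias, variance, bias ** 2 + variance


# Hypothetical summary data: variants 0-1 are valid, variants 2-4 have
# small direct effects on the outcome.
bx = [0.10, 0.12, 0.08, 0.09, 0.11]
se_y = [0.02] * 5
alpha = [0.0, 0.0, 0.002, -0.001, 0.001]

valid_only = [0, 1]
all_variants = [0, 1, 2, 3, 4]

_, _, mse_valid = ivw_bias_variance(bx, se_y, alpha, valid_only)
_, _, mse_all = ivw_bias_variance(bx, se_y, alpha, all_variants)

print("MSE, valid instruments only:", mse_valid)
print("MSE, all variants:", mse_all)
```

In this constructed example the three mildly invalid variants add a small bias but roughly halve the variance, so the approximate mean squared error is lower when they are included. Larger pleiotropic effects would reverse that ordering, which is why selection has to weigh each variant's contribution to both terms.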
