Selection bias is a common concern in epidemiologic studies. In the literature, selection bias is often viewed as a missing data problem. Popular approaches to adjust for bias due to missing data, such as inverse probability weighting, rely on the assumption that data are missing at random and can yield biased results if this assumption is violated. In observational studies with outcome data missing not at random, Heckman's sample selection model can be used to adjust for bias due to missing data. In this paper, we review Heckman's method and a similar approach proposed by Tchetgen Tchetgen and Wirth (2017). We then discuss how to apply these methods to Mendelian randomization analyses using individual-level data, with missing data for either the exposure or outcome or both. We explore whether genetic variants associated with participation can be used as instruments for selection. We then describe how to obtain missingness-adjusted Wald ratio, two-stage least squares and inverse variance weighted estimates. The two methods are evaluated and compared in simulations, with results suggesting that they can both mitigate selection bias but may yield parameter estimates with large standard errors in some settings. In an illustrative real-data application, we investigate the effects of body mass index on smoking using data from the Avon Longitudinal Study of Parents and Children.
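As a point of reference for the estimators named above, the following is a minimal sketch of the standard, unadjusted Wald ratio and fixed-effect inverse variance weighted (IVW) estimates computed from per-variant association estimates. It does not implement the missingness adjustment discussed in this paper. The function names `wald_ratio` and `ivw` and all numbers are hypothetical, chosen only for illustration, and the standard errors use the usual first-order approximation that ignores uncertainty in the variant-exposure associations.

```python
from math import sqrt


def wald_ratio(beta_gx, beta_gy, se_gy):
    """Wald ratio for a single variant.

    beta_gx: variant-exposure association
    beta_gy: variant-outcome association
    se_gy:   standard error of beta_gy
    Returns (estimate, first-order standard error).
    """
    estimate = beta_gy / beta_gx
    std_error = se_gy / abs(beta_gx)
    return estimate, std_error


def ivw(betas_gx, betas_gy, ses_gy):
    """Fixed-effect IVW estimate across several variants.

    Each variant's Wald ratio is weighted by the inverse of its
    first-order variance, i.e. by (beta_gx / se_gy) ** 2.
    Returns (estimate, standard error).
    """
    weights = [(bx / se) ** 2 for bx, se in zip(betas_gx, ses_gy)]
    ratios = [by / bx for bx, by in zip(betas_gx, betas_gy)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    std_error = sqrt(1.0 / total_weight)
    return estimate, std_error


# Hypothetical association estimates for three variants (illustration only).
example_betas_gx = [0.10, 0.20, 0.25]
example_betas_gy = [0.05, 0.10, 0.125]
example_ses_gy = [0.02, 0.02, 0.05]

single_estimate, single_se = wald_ratio(
    example_betas_gx[0], example_betas_gy[0], example_ses_gy[0]
)
ivw_estimate, ivw_se = ivw(example_betas_gx, example_betas_gy, example_ses_gy)
```

In this toy input every variant has the same ratio, so the IVW estimate coincides with each individual Wald ratio and only the standard error shrinks as variants are combined.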