The Consent-to-Contact (C2C) registry at the University of California, Irvine collects data from community participants to support recruitment into clinical research studies. Self-selection into the C2C likely introduces bias, in part because enrollees have more years of education than the general US population. Salazar et al. (2020) recently used the C2C to examine associations of race/ethnicity with participant willingness to be contacted about research studies. To address questions about the generalizability of the estimated associations, we estimate propensity weights for self-selection into the convenience sample using data from the National Health and Nutrition Examination Survey (NHANES). We create a combined dataset of C2C and NHANES subjects and compare different approaches (logistic regression, covariate balancing propensity score, entropy balancing, and random forest) for estimating the probability of membership in C2C relative to NHANES. We propose methods for estimating the variance of parameter estimates that account for the uncertainty arising from estimating the propensity weights. Simulation studies explore the impact of propensity weight estimation on uncertainty. We demonstrate the approach by repeating the analysis of Salazar et al. with the estimated propensity weights for the C2C subjects and contrasting the results of the two analyses. This method can be implemented using our estweight package in R, available on GitHub.
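To make the weighting idea concrete, the following is a minimal Python sketch of the logistic regression variant only, written under stated assumptions. It is not the estweight API. The function names (`fit_logistic`, `inverse_odds_weights`), the simulated education data, and the choice of inverse-odds weights are all illustrative assumptions. The sketch also ignores the NHANES survey weights and the variance estimation that the abstract describes.

```python
import numpy as np


def fit_logistic(X, z, n_iter=100, tol=1e-10):
    """Fit P(z = 1 | X) by Newton-Raphson; X must already contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (z - p)                        # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # observed information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def inverse_odds_weights(X_conv, X_ref):
    """Hypothetical helper: weights that make the convenience sample resemble the reference sample."""
    n_conv = len(X_conv)
    # Stack the two samples and label membership: 1 = convenience, 0 = reference.
    X = np.vstack([X_conv, X_ref])
    X = np.column_stack([np.ones(len(X)), X])
    z = np.concatenate([np.ones(n_conv), np.zeros(len(X_ref))])
    beta = fit_logistic(X, z)
    # Estimated probability of convenience-sample membership for convenience rows.
    p = 1.0 / (1.0 + np.exp(-X[:n_conv] @ beta))
    w = (1.0 - p) / p        # inverse odds of membership
    return w / w.mean()      # normalise to mean 1


# Simulated data (assumption): the convenience sample has more years of education.
rng = np.random.default_rng(0)
edu_ref = rng.normal(13.0, 3.0, size=(2000, 1))   # stand-in for the reference survey
edu_conv = rng.normal(16.0, 3.0, size=(2000, 1))  # stand-in for the registry

w = inverse_odds_weights(edu_conv, edu_ref)
raw_mean = edu_conv.mean()
weighted_mean = np.average(edu_conv[:, 0], weights=w)
print(f"reference mean:            {edu_ref.mean():.2f}")
print(f"convenience mean (raw):    {raw_mean:.2f}")
print(f"convenience mean (weighted): {weighted_mean:.2f}")
```

In this sketch the weighted convenience-sample mean moves toward the reference mean, which is the behavior the propensity weights are meant to produce. Treating the fitted weights as fixed would understate uncertainty, which is the issue the proposed variance methods address.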