The case-cohort study design bypasses resource constraints by collecting certain expensive covariates for only a small subset of the full cohort. Weighted Cox regression is the most widely used approach for analysing case-cohort data within the Cox model, but it is inefficient. Alternative approaches based on multiple imputation and nonparametric maximum likelihood suffer from incompatibility and computational issues, respectively. We introduce a novel Bayesian framework for case-cohort Cox regression that avoids these problems. Users can include auxiliary variables to help predict the unmeasured expensive covariates with a prediction model of their choice, while the models for the nuisance parameters are nonparametrically specified and integrated out. Posterior sampling can be carried out using procedures based on the pseudo-marginal MCMC algorithm. The method scales effectively to large, complex datasets, as demonstrated in our application: investigating the associations between saturated fatty acids and type 2 diabetes using the EPIC-Norfolk study. As part of our analysis, we also develop a new approach for handling compositional data in the Cox model, leading to more reliable and interpretable results compared to previous studies. The performance of our method is illustrated with extensive simulations. The code used to produce the results in this paper can be found at https://github.com/andrewyiu/bayes_cc.
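For readers unfamiliar with pseudo-marginal MCMC, the following is a minimal sketch of the generic idea on a toy model. It is not the sampler from the paper or its repository. The assumed toy model is y_i | z_i ~ N(z_i, 1) with latent z_i | theta ~ N(theta, 1), where z_i loosely stands in for an unmeasured covariate, and theta has a N(0, 10^2) prior. All function names and settings are hypothetical. The intractable likelihood is replaced by an unbiased Monte Carlo estimate, and the estimate attached to the current state is reused rather than recomputed, which is what keeps the exact posterior as the target.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def log_prior(theta):
    """Log density (up to a constant) of the assumed N(0, 10^2) prior."""
    return -0.5 * (theta / 10.0) ** 2


def log_lik_estimate(theta, ys, n_particles, rng):
    """Log of an unbiased Monte Carlo estimate of prod_i p(y_i | theta).

    Each factor p(y_i | theta) is estimated by averaging p(y_i | z) over
    fresh draws z ~ N(theta, 1); independent unbiased factors multiply
    to an unbiased estimate of the product.
    """
    total = 0.0
    for y in ys:
        draws = [rng.gauss(theta, 1.0) for _ in range(n_particles)]
        est = sum(normal_pdf(y, z, 1.0) for z in draws) / n_particles
        if est == 0.0:
            return -math.inf  # numerically zero estimate: proposal is rejected
        total += math.log(est)
    return total


def pseudo_marginal_mh(ys, n_iter=5000, n_particles=20, step=0.5, seed=0):
    """Random-walk Metropolis-Hastings using estimated likelihoods."""
    rng = random.Random(seed)
    theta = sum(ys) / len(ys)  # start at the sample mean
    log_post = log_prior(theta) + log_lik_estimate(theta, ys, n_particles, rng)
    samples = []
    for _ in range(n_iter):
        prop = theta + rng.gauss(0.0, step)
        log_post_prop = log_prior(prop) + log_lik_estimate(
            prop, ys, n_particles, rng
        )
        # The stored estimate for the current state is reused, not refreshed.
        if math.log(1.0 - rng.random()) < log_post_prop - log_post:
            theta, log_post = prop, log_post_prop
        samples.append(theta)
    return samples
```

In this toy model the marginal likelihood is available in closed form (y_i ~ N(theta, 2)), so the sampler's output can be checked against the exact Gaussian posterior. In realistic settings no such check is available, and the number of Monte Carlo draws per estimate trades computation against how often the chain gets stuck.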