Representative risk estimation is fundamental to clinical decision-making. However, risks are often estimated from non-representative epidemiologic studies, which usually underrepresent minorities. "Model-based" methods use population registries to improve the external validity of risk estimation, but they assume that hazard ratios (HRs) generalize from the sample to the target finite population. "Pseudoweighting" methods improve the representativeness of a study by using an external probability-based survey as the reference, but the resulting estimators can be biased when the propensity model is misspecified, or inefficient when pseudoweights are highly variable or when minorities have small sample sizes in the cohort and/or survey. We propose a two-step pseudoweighting procedure that poststratifies the event rates within age/race/sex strata of the pseudoweighted cohort to the corresponding population rates, producing efficient and robust estimates of pure risk (i.e., a cause-specific absolute risk in the absence of competing events). In developing an all-cause mortality risk model representative of the US population, we find that HRs for minorities are not generalizable and that surveys can have inadequate numbers of events for minorities. Poststratification on event rates is crucial for obtaining reliable risk estimates for minority subgroups.
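To make the poststratification step concrete, the following is a minimal sketch and not the authors' implementation. It assumes each cohort member already carries a pseudoweight from the first step, and that population event rates per age/race/sex stratum are available from a registry. The function name `poststratify_event_rates`, the record fields, and the simple ratio adjustment are all illustrative assumptions; the abstract does not specify how the adjustment is carried out.

```python
from collections import defaultdict


def poststratify_event_rates(records, population_rates):
    """Compare pseudoweighted cohort event rates with population rates.

    records: iterable of dicts with hypothetical keys
        'stratum'      - age/race/sex stratum label
        'weight'       - pseudoweight from the first step
        'event'        - 1 if the event occurred, else 0
        'person_years' - follow-up time
    population_rates: dict mapping stratum -> events per person-year
        in the target population.

    Returns a dict mapping stratum -> (weighted cohort rate, ratio factor),
    where the factor is population rate / weighted cohort rate.
    """
    weighted_events = defaultdict(float)
    weighted_time = defaultdict(float)
    for r in records:
        weighted_events[r["stratum"]] += r["weight"] * r["event"]
        weighted_time[r["stratum"]] += r["weight"] * r["person_years"]

    result = {}
    for stratum, pop_rate in population_rates.items():
        if weighted_events[stratum] <= 0 or weighted_time[stratum] <= 0:
            # No weighted events or follow-up: the ratio is undefined,
            # which is exactly the sparse-stratum problem for small subgroups.
            raise ValueError(f"no usable cohort data in stratum {stratum!r}")
        cohort_rate = weighted_events[stratum] / weighted_time[stratum]
        result[stratum] = (cohort_rate, pop_rate / cohort_rate)
    return result


# Toy data, invented purely for illustration.
toy_records = [
    {"stratum": "A", "weight": 2.0, "event": 1, "person_years": 5.0},
    {"stratum": "A", "weight": 1.0, "event": 0, "person_years": 10.0},
]
toy_population_rates = {"A": 0.2}
adjustment = poststratify_event_rates(toy_records, toy_population_rates)
```

A factor above 1 indicates that the pseudoweighted cohort understates the stratum's event rate relative to the population. How such a factor would enter the risk model (for example, by rescaling a stratum-specific baseline hazard) is left open here.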