It is increasingly common for researchers to incorporate external information from large studies to improve the accuracy of statistical inference, rather than relying solely on a modestly sized dataset collected internally. When some new predictors are available only internally, we aim to build improved regression models based on individual-level data from an "internal" study while incorporating summary-level information from "external" models. We propose a meta-analysis framework, along with two weighted estimators formed as composites of empirical Bayes estimators, that combines the estimates from the different external models. The proposed framework is flexible and robust in that (i) it can incorporate external models that use a slightly different set of covariates; (ii) it can identify the most relevant external information and diminish the influence of information that is less compatible with the internal data; and (iii) it balances the bias-variance trade-off while preserving most of the efficiency gain. The proposed estimators are more efficient than both the naive analysis of the internal data alone and naive combinations of external estimators.
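To make the idea of weighting empirical Bayes estimators concrete, the following is a minimal scalar sketch, not the estimators proposed in this paper. It assumes a single coefficient, an internal estimate with known variance, and several external estimates. The function names `eb_shrinkage` and `combine`, the specific shrinkage form, and the compatibility-proportional weights are all illustrative assumptions.

```python
def eb_shrinkage(beta_int, var_int, beta_ext):
    """Shrink an internal estimate toward one external estimate.

    Assumed (illustrative) form: the weight on the external estimate is
    var_int / (var_int + delta^2), where delta is the observed discrepancy.
    A large discrepancy pushes the result back toward the internal estimate.
    """
    delta_sq = (beta_int - beta_ext) ** 2
    w_ext = var_int / (var_int + delta_sq)
    beta_eb = w_ext * beta_ext + (1.0 - w_ext) * beta_int
    return beta_eb, w_ext


def combine(beta_int, var_int, betas_ext):
    """Weighted composite of per-model shrinkage estimates.

    Hypothetical weighting: each external model's weight is proportional to
    its shrinkage weight, so more compatible external models count for more.
    """
    pairs = [eb_shrinkage(beta_int, var_int, b) for b in betas_ext]
    total = sum(w for _, w in pairs)
    weights = [w / total for _, w in pairs]
    combined = sum(wt * eb for wt, (eb, _) in zip(weights, pairs))
    return combined, weights


# Made-up numbers for illustration: one compatible and one incompatible external model.
combined, weights = combine(beta_int=1.0, var_int=0.04, betas_ext=[1.05, 2.0])
```

In this sketch the external estimate of 1.05 is close to the internal estimate and receives most of the weight, while the estimate of 2.0 is heavily down-weighted, which mirrors property (ii) above in a simplified form.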