CODA: Calibrated Optimal Decision Making with Multiple Data Sources and Limited Outcome

We consider optimal decision making in a primary sample of interest when multiple auxiliary data sources are available. The outcome of interest is limited in the sense that it is observed only in the primary sample. In practice, such data sources may come from heterogeneous studies and thus cannot be combined directly. This paper proposes a new framework that handles heterogeneous studies and addresses the limited outcome simultaneously through a novel calibrated optimal decision making (CODA) method, which leverages the intermediate outcomes common to the multiple data sources. Specifically, CODA allows the baseline covariates across samples to have either homogeneous or heterogeneous distributions. Under a mild and testable assumption that the conditional means of the intermediate outcomes, given baseline covariates and treatment information, are equal across samples, we show that the proposed CODA estimator of the conditional mean outcome is asymptotically normal and more efficient than the estimator that uses the primary sample alone. In addition, the variance of the CODA estimator can be obtained with a simple plug-in method, owing to rate double robustness. Extensive experiments on simulated datasets demonstrate the empirical validity and improved efficiency of CODA, followed by a real application that uses a MIMIC-III dataset as the primary sample and auxiliary data from eICU.
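To make the calibration idea concrete, here is a minimal sketch of a generic control-variate adjustment. It is not the paper's estimator. It assumes the simplest setting the abstract mentions (homogeneous covariate distributions across samples), a binary treatment randomized with a known probability, a single scalar intermediate outcome, and a fixed decision rule whose mean outcome is being estimated. The function names `ipw_terms` and `calibrated_value` and all argument names are hypothetical.

```python
import numpy as np


def ipw_terms(outcome, treatment, decision, propensity):
    """Per-subject inverse-probability-weighted terms for the mean outcome
    under a decision rule. Assumes a binary treatment and a known
    randomization probability P(treatment = 1) = propensity."""
    agree = (treatment == decision).astype(float)
    prob = np.where(decision == 1, propensity, 1.0 - propensity)
    return agree * outcome / prob


def calibrated_value(y_p, m_p, a_p, d_p, m_a, a_a, d_a, propensity=0.5):
    """Control-variate calibration (illustrative assumption, not the paper's
    exact method). Suffix _p marks the primary sample, _a the auxiliary one.
    y: outcome of interest, m: intermediate outcome, a: treatment received,
    d: treatment recommended by the decision rule."""
    phi_y = ipw_terms(y_p, a_p, d_p, propensity)    # primary, outcome of interest
    phi_mp = ipw_terms(m_p, a_p, d_p, propensity)   # primary, intermediate outcome
    phi_ma = ipw_terms(m_a, a_a, d_a, propensity)   # auxiliary, intermediate outcome
    n_p, n_a = len(phi_y), len(phi_ma)

    v_primary = phi_y.mean()
    # Difference of intermediate-outcome values across samples. Under the
    # assumptions stated above it has mean zero, so it can serve as a
    # control variate.
    gap = phi_mp.mean() - phi_ma.mean()

    var_primary = phi_y.var(ddof=1) / n_p
    cov = np.cov(phi_y, phi_mp)[0, 1] / n_p         # Cov(v_primary, gap)
    var_gap = phi_mp.var(ddof=1) / n_p + phi_ma.var(ddof=1) / n_a

    coef = cov / var_gap                            # variance-minimizing weight
    return {
        "primary_only": v_primary,
        "calibrated": v_primary - coef * gap,
        "var_primary": var_primary,
        "var_calibrated": var_primary - cov**2 / var_gap,  # plug-in variance
    }
```

By construction the plug-in variance of the calibrated value is never larger than that of the primary-only value, and the reduction grows as the intermediate outcome becomes more strongly correlated with the outcome of interest. The heterogeneous-covariate case described in the abstract needs additional adjustment that this sketch does not attempt.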
