With the growing adoption of electronic health records, there is increasing interest in developing individualized treatment rules, which recommend treatments according to patients' characteristics, from large observational data. However, valid inference procedures are lacking for rules developed from such data in the presence of high-dimensional covariates. In this work, we develop a penalized doubly robust method to estimate the optimal individualized treatment rule from high-dimensional data. We propose a split-and-pooled de-correlated score to construct hypothesis tests and confidence intervals. Our proposal uses data splitting to overcome the slow convergence rates of nuisance parameter estimators, such as non-parametric estimators of the outcome regression or propensity models. We establish the limiting distributions of the split-and-pooled de-correlated score test and the corresponding one-step estimator in the high-dimensional setting. Simulation studies and real data analysis are conducted to demonstrate the superiority of the proposed method.