Predictive biomarker discovery is central to precision medicine: predictive biomarkers define the patient subpopulations that stand to benefit most, or least, from a given treatment. These biomarkers are often identified as a byproduct of a related but fundamentally different task, treatment rule estimation. When treatment rule estimation methods are used to identify predictive biomarkers in clinical trials where the number of covariates exceeds the number of participants, the result is often a high false discovery rate. The higher-than-expected number of false positives wastes resources in follow-up experiments for drug target identification and diagnostic assay development, which in turn negatively affects patient outcomes. We propose a variable importance parameter for directly assessing the importance of potentially predictive biomarkers, and develop a flexible semiparametric inference procedure for this estimand. We prove that our estimator is doubly robust and asymptotically linear under mild conditions on the data-generating process, permitting valid inference about the importance metric. The statistical guarantees of the method are verified in a thorough simulation study representative of randomized controlled trials with moderate- and high-dimensional covariate vectors. Our procedure is then used to discover predictive biomarkers in the tumor gene expression data of metastatic renal cell carcinoma patients enrolled in recently completed clinical trials. We find that our approach more readily distinguishes predictive from non-predictive biomarkers than procedures whose primary purpose is treatment rule estimation. An open-source software implementation of the methodology, the uniCATE R package, is briefly introduced.