Robust Estimation of Heterogeneous Treatment Effects using Electronic Health Record Data

Estimation of heterogeneous treatment effects is an essential component of precision medicine. Model-based and algorithm-based methods have been developed within the causal inference framework to achieve valid estimation and inference. Existing methods, such as the A-learner, R-learner, modified covariates method (with and without efficiency augmentation), inverse propensity score weighting, and augmented inverse propensity score weighting, have been proposed mostly under the squared error loss function. The performance of these methods in the presence of data irregularity and high dimensionality, such as that encountered in electronic health record (EHR) data analysis, has been less studied. In this research, we describe a general formulation that unifies many of the existing learners through a common score function. The new formulation allows the incorporation of least absolute deviation (LAD) regression and dimension reduction techniques to counter the challenges in EHR data analysis. We show that under a set of mild regularity conditions, the resultant estimator is asymptotically normal. Within this framework, we propose two specific estimators for EHR analysis based on weighted LAD with simultaneous penalties for sparsity and smoothness. Our simulation studies show that the proposed methods are more robust to outliers under various circumstances. We use these methods to assess the blood pressure-lowering effects of two commonly used antihypertensive therapies.
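To make the idea of a weighted LAD learner concrete, the following is a minimal sketch and not the estimators proposed in this work. It assumes a modified covariates formulation with treatment coded as -1/+1, a known propensity of 0.5 (a randomized design), and a linear treatment effect model. It omits the sparsity and smoothness penalties entirely. The function name `weighted_lad`, the simulated data, and the use of iteratively reweighted least squares (IRLS) as an approximate LAD solver are all illustrative assumptions.

```python
import numpy as np


def weighted_lad(Z, y, w, n_iter=200, eps=1e-6, tol=1e-8):
    """Approximately minimize sum_i w_i * |y_i - Z_i @ beta| via IRLS.

    Hypothetical helper for illustration: each iteration solves a weighted
    least squares problem whose weights are w_i / max(|residual_i|, eps).
    """
    sw = np.sqrt(w)
    # Start from the weighted least squares solution.
    beta = np.linalg.lstsq(Z * sw[:, None], y * sw, rcond=None)[0]
    for _ in range(n_iter):
        resid = np.abs(y - Z @ beta)
        omega = w / np.maximum(resid, eps)
        ZW = Z * omega[:, None]
        beta_new = np.linalg.solve(Z.T @ ZW, ZW.T @ y)
        if np.max(np.abs(beta_new - beta)) < tol:
            beta = beta_new
            break
        beta = beta_new
    return beta


# Simulated data (assumed setup, not from the paper).
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])       # intercept + one covariate
T = rng.choice([-1, 1], size=n)            # randomized treatment, coded -1/+1
p_treat = np.full(n, 0.5)                  # known propensity P(T = 1 | X)
tau = 1.0 + 2.0 * x                        # true treatment effect: tau(x) = 1 + 2x
y = 0.5 * x + 0.5 * T * tau + rng.standard_t(df=2, size=n)  # heavy-tailed noise
y[:20] += 50.0                             # contaminate 1% of outcomes with gross outliers

# Modified covariates: multiply each covariate by T / 2, weight by inverse propensity.
Z = X * (T / 2.0)[:, None]
w = np.where(T == 1, 1.0 / p_treat, 1.0 / (1.0 - p_treat))

beta_hat = weighted_lad(Z, y, w)
print("estimated treatment effect coefficients:", beta_hat)
```

Under these assumptions, `X @ beta_hat` is an estimate of the treatment effect at each covariate value. The absolute value loss limits the influence of the contaminated outcomes, which is the motivation for replacing the squared error loss in this setting.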
