In this paper, we propose a data-adaptive empirical likelihood-based approach for treatment effect estimation and inference. It overcomes the limitations of traditional empirical likelihood-based approaches in high-dimensional settings by adopting penalized regression and machine learning methods to model the covariate-outcome relationship. In particular, we show that our procedure recovers the true variance of Zhang's treatment effect estimator (Zhang, 2018) by utilizing a data-splitting technique. Our proposed estimator is proved to be asymptotically normal and semiparametrically efficient under mild regularity conditions. Simulation studies indicate that our estimator is more efficient than the estimator proposed by Wager et al. (2016) when random forests are employed to model the covariate-outcome relationship. Moreover, when multiple machine learning models are used, our estimator is at least as efficient as any regular estimator based on a single machine learning model. We compare our method to existing ones using the ACTG175 data and the GSE118657 data, and the results confirm the strong performance of our approach.
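To illustrate the general idea of data splitting mentioned above, the following is a minimal sketch of a cross-fitted, regression-adjusted treatment effect estimator for a randomized trial. It is not the empirical likelihood procedure proposed in this paper: it is a generic augmented estimator, and ordinary least squares stands in for the penalized regression or random forest outcome model. The function names (`fit_ols`, `predict_ols`, `cross_fit_ate`) and the parameters `n_folds` and `seed` are hypothetical, introduced only for this example.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit with an intercept (placeholder for any ML outcome model)."""
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def predict_ols(beta, X):
    """Predict outcomes from coefficients returned by fit_ols."""
    return np.column_stack([np.ones(len(X)), X]) @ beta


def cross_fit_ate(X, a, y, n_folds=2, seed=0):
    """Cross-fitted augmented estimate of the average treatment effect.

    X: (n, p) covariates; a: (n,) binary treatment indicator; y: (n,) outcomes.
    Assumes a randomized trial, so the treatment probability is estimated
    by the sample proportion of treated units.
    Returns (estimate, standard_error).
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = rng.permutation(n) % n_folds  # balanced random fold labels
    pi = a.mean()

    m1 = np.empty(n)  # predicted outcome under treatment
    m0 = np.empty(n)  # predicted outcome under control
    for k in range(n_folds):
        train, test = folds != k, folds == k
        # Outcome models are fit on the other folds only, so each
        # prediction is independent of the observation it is applied to.
        beta1 = fit_ols(X[train & (a == 1)], y[train & (a == 1)])
        beta0 = fit_ols(X[train & (a == 0)], y[train & (a == 0)])
        m1[test] = predict_ols(beta1, X[test])
        m0[test] = predict_ols(beta0, X[test])

    # Augmented scores: model contrast plus inverse-probability-weighted residuals.
    psi = m1 - m0 + a * (y - m1) / pi - (1 - a) * (y - m0) / (1 - pi)
    return psi.mean(), psi.std(ddof=1) / np.sqrt(n)
```

Calling `cross_fit_ate(X, a, y)` returns a point estimate and a plug-in standard error. Replacing `fit_ols` and `predict_ols` with another learner leaves the splitting logic unchanged, which is the main point of the sketch.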