Doubly Robust Direct Learning for Estimating Conditional Average Treatment Effect

Inferring the heterogeneous treatment effect is a fundamental problem in the sciences and in commercial applications. In this paper, we focus on estimating the Conditional Average Treatment Effect (CATE), that is, the difference in the conditional mean outcome between treatments given covariates. Traditionally, Q-Learning-based approaches rely on estimating the conditional mean outcome given treatment and covariates. However, they are subject to misspecification of the main effect model. Recently, simple and flexible one-step methods that directly learn the CATE (D-Learning) without model specifications have been proposed. However, these methods are not robust against misspecification of the propensity score model. We propose a new framework for CATE estimation, robust direct learning (RD-Learning), which leads to doubly robust estimators of the treatment effect. Consistency of our CATE estimator is guaranteed if either the main effect model or the propensity score model is correctly specified. The framework can be used in both the binary and the multi-arm settings, and it is general enough to allow different function spaces and to incorporate different generic learning algorithms. As a by-product, we develop a competitive statistical inference tool for the treatment effect, assuming the propensity score is known. We provide theoretical insights into the proposed method using risk bounds under both linear and non-linear settings. The effectiveness of the proposed method is demonstrated by simulation studies and a real-data example from an AIDS Clinical Trials study.
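
To make the double robustness claim concrete, the following is a minimal sketch of an inverse-probability-weighted residual regression for the binary setting. It is not the authors' RD-Learning implementation. Everything in it is an assumption introduced for illustration: the treatment coding A in {-1, +1}, the outcome model Y = main(X) + A * cate(X) / 2 + noise, the linear function class, and the names `simulate`, `fit_linear_cate`, `prop`, and `m_hat`. The sketch fits the CATE once with the true propensity but a wrong main effect, and once with the true main effect but a wrong propensity.

```python
import numpy as np


def simulate(n=50_000, seed=0):
    """Toy data (assumed model): A in {-1, +1}, Y = main(X) + A * cate(X) / 2 + noise."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 2))
    p_plus = 1.0 / (1.0 + np.exp(-0.5 * X[:, 0]))      # true P(A = +1 | X)
    A = np.where(rng.uniform(size=n) < p_plus, 1.0, -1.0)
    prop = np.where(A > 0, p_plus, 1.0 - p_plus)        # true pi(A_i, X_i)
    main = 1.0 + X[:, 0] ** 2                           # nonlinear main effect
    cate = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1]          # true CATE, linear in X
    Y = main + A * cate / 2.0 + rng.normal(size=n)
    return X, A, Y, prop, main


def fit_linear_cate(X, A, Y, prop, m_hat):
    """Minimize sum_i (Y_i - m_hat_i - A_i * f(X_i))^2 / prop_i over linear f.

    Returns the coefficients (intercept first) of the CATE estimate 2 * f.
    """
    design = np.column_stack([np.ones(len(X)), X])
    Z = design * A[:, None]             # regressors A_i * [1, X_i]
    sw = np.sqrt(1.0 / prop)            # square-root inverse-probability weights
    coef, *_ = np.linalg.lstsq(Z * sw[:, None], (Y - m_hat) * sw, rcond=None)
    return 2.0 * coef


if __name__ == "__main__":
    X, A, Y, prop, main = simulate()
    # Correct propensity, deliberately wrong main effect (all zeros).
    print(fit_linear_cate(X, A, Y, prop, np.zeros_like(Y)))
    # Correct main effect, deliberately wrong propensity (constant 0.5).
    print(fit_linear_cate(X, A, Y, np.full_like(Y, 0.5), main))
```

Under this assumed model, both fits should land near the true CATE coefficients (1, 2, -1) in large samples: with the correct propensity, the weighted cross term between A and any function of X alone averages to zero, and with the correct main effect, the residual is exactly A * cate(X) / 2 plus noise, so any positive weights recover it.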
