Estimating heterogeneous treatment effects (HTEs) in time-varying settings is particularly challenging because the probability of observing a given treatment sequence decreases exponentially as the prediction horizon grows. As a result, the observed data contain little support for many plausible treatment sequences, which creates severe overlap problems. Existing meta-learners for the time-varying setting typically assume adequate treatment overlap and therefore suffer from exploding estimation variance when overlap is low. To address this problem, we introduce a novel overlap-weighted orthogonal (WO) meta-learner for estimating HTEs that targets regions of the observed data with a high probability of receiving the interventional treatment sequences. This offers a fully data-driven approach through which our WO-learner counteracts the instabilities of existing meta-learners and thus obtains more reliable HTE estimates. Methodologically, we develop a novel Neyman-orthogonal population risk function that minimizes the overlap-weighted oracle risk. We show that our WO-learner is Neyman-orthogonal, meaning that it is robust against misspecification of the nuisance functions. Further, our WO-learner is fully model-agnostic and can be applied to any machine learning model. Through extensive experiments with both transformer and LSTM backbones, we demonstrate the benefits of our WO-learner.
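To make the overlap problem concrete, the following minimal sketch shows how quickly the probability of a fixed treatment sequence shrinks with the horizon, and how an inverse-probability weight grows in response. It is an illustration only, not the WO-learner itself: the function name `sequence_probability`, the constant per-step propensity of 0.3, and the assumption that treatments are independent across time steps are all hypothetical simplifications. The bounded weight shown alongside (the sequence probability itself) is likewise just one simple way to emphasize high-overlap regions and is not claimed to be the weighting used in the paper.

```python
def sequence_probability(propensities, treatments):
    """Probability of observing a binary treatment sequence.

    Assumes (for illustration only) that treatments are independent across
    time steps, so the sequence probability is the product of the per-step
    probabilities.
    """
    prob = 1.0
    for p, a in zip(propensities, treatments):
        # p is the probability of treatment (a == 1) at this step.
        prob *= p if a == 1 else 1.0 - p
    return prob


# Hypothetical per-step propensity of receiving treatment.
PER_STEP_PROPENSITY = 0.3

for horizon in (1, 2, 5, 10):
    always_treat = [1] * horizon
    prob = sequence_probability([PER_STEP_PROPENSITY] * horizon, always_treat)
    inverse_weight = 1.0 / prob  # grows exponentially with the horizon
    bounded_weight = prob        # stays in [0, 1], down-weights low overlap
    print(
        f"horizon={horizon:2d}  P(sequence)={prob:.2e}  "
        f"inverse weight={inverse_weight:.2e}  bounded weight={bounded_weight:.2e}"
    )
```

Under these assumptions, the inverse weight is the quantity that drives variance in estimators relying on adequate overlap, whereas a weight that shrinks with the sequence probability keeps the contribution of poorly supported regions small.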