Estimating counterfactual outcomes over time from observational data is relevant for many applications (e.g., personalized medicine). Yet, state-of-the-art methods build upon simple long short-term memory (LSTM) networks, which makes inference over complex, long-range dependencies challenging. In this paper, we develop a novel Causal Transformer for estimating counterfactual outcomes over time. Our model is specifically designed to capture complex, long-range dependencies among time-varying confounders. To this end, we combine three transformer subnetworks with separate inputs for time-varying covariates, previous treatments, and previous outcomes into a joint network with in-between cross-attentions. We further develop a custom, end-to-end training procedure for our Causal Transformer. Specifically, we propose a novel counterfactual domain confusion loss to address confounding bias: it aims to learn adversarially balanced representations that are predictive of the next outcome but non-predictive of the current treatment assignment. We evaluate our Causal Transformer on synthetic and real-world datasets, where it achieves superior performance over current baselines. To the best of our knowledge, this is the first work to propose a transformer-based architecture for estimating counterfactual outcomes from longitudinal data.
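To make the balancing objective concrete, the following is a minimal PyTorch sketch of an adversarial domain confusion training step: a treatment classifier is fit to predict the current treatment from the shared representation, while the representation network is trained on the factual outcome loss plus a confusion term that pushes the classifier toward a uniform prediction over treatments. All names here (`repr_net`, `outcome_head`, `treat_head`, `alpha`) are hypothetical placeholders, and the single encoder stands in for the paper's actual three-subnetwork transformer with cross-attentions; this is a sketch of the general technique, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def training_step(history, treatment, next_outcome, alpha,
                  repr_net, outcome_head, treat_head,
                  opt_main, opt_adv):
    """One adversarial balancing step (hypothetical module names).

    repr_net     : encodes the observed history into a representation phi
                   (stand-in for the three transformer subnetworks).
    outcome_head : predicts the next outcome from (phi, current treatment).
    treat_head   : classifies the current treatment from phi alone.
    """
    # --- adversarial step: fit the treatment classifier on a detached phi ---
    phi = repr_net(history)
    treat_logits = treat_head(phi.detach())
    loss_treat = F.cross_entropy(treat_logits, treatment)
    opt_adv.zero_grad()
    loss_treat.backward()
    opt_adv.step()

    # --- main step: factual outcome loss + domain confusion loss -----------
    phi = repr_net(history)
    y_hat = outcome_head(phi, treatment)
    loss_outcome = F.mse_loss(y_hat, next_outcome)

    # Confusion term: cross-entropy of the classifier's prediction against
    # the uniform distribution over treatments, so that phi becomes
    # non-predictive of the current treatment assignment.
    log_probs = F.log_softmax(treat_head(phi), dim=-1)
    loss_conf = -log_probs.mean()

    loss = loss_outcome + alpha * loss_conf
    opt_main.zero_grad()
    loss.backward()
    opt_main.step()
    return loss_outcome.item(), loss_treat.item()
```

Detaching `phi` in the first step lets the classifier improve without degrading the representation, while the confusion term in the second step only updates the representation (and heads), yielding the two-player objective described above.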