Within the field of causal inference, we consider the problem of estimating heterogeneous treatment effects from data. We propose and validate a novel approach to learning feature representations that aid the estimation of the conditional average treatment effect (CATE). Our method focuses on an intermediate layer in a neural network trained to predict the outcome from the features. In contrast to previous approaches that encourage the distribution of representations to be treatment-invariant, we use a genetic algorithm that searches over representations useful for predicting the outcome and selects those less useful for predicting the treatment. This allows us to retain information in the features that is useful for predicting the outcome, even if that information is related to treatment assignment. We validate our method on synthetic examples and illustrate its use on a real-life dataset.
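The selection idea described above can be illustrated with a small sketch. This is not the authors' implementation: it assumes the representation `H` has already been read off an intermediate layer of an outcome network (here it is simply simulated), it uses the in-sample R^2 of a linear fit as a crude stand-in for "useful for predicting", and it replaces the paper's genetic algorithm with a simple mutation-only evolutionary search over masks of hidden units. The names `linear_r2`, `evolve_mask`, `n_keep`, and `lam` are hypothetical.

```python
import numpy as np


def linear_r2(H, target, mask):
    """In-sample R^2 of a least-squares fit of `target` on the masked columns of H."""
    X = np.column_stack([np.ones(len(target)), H[:, mask]])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    return 1.0 - resid.var() / target.var()


def evolve_mask(H, y, t, n_keep, lam=1.0, pop_size=30, n_gen=40, seed=0):
    """Search for n_keep units that predict outcome y well and treatment t poorly.

    Assumption: fitness = R^2(outcome) - lam * R^2(treatment). Requires n_keep < H.shape[1].
    """
    rng = np.random.default_rng(seed)
    d = H.shape[1]

    def random_mask():
        m = np.zeros(d, dtype=bool)
        m[rng.choice(d, size=n_keep, replace=False)] = True
        return m

    def fitness(m):
        return linear_r2(H, y, m) - lam * linear_r2(H, t, m)

    pop = [random_mask() for _ in range(pop_size)]
    for _ in range(n_gen):
        # Keep the fitter half of the population unchanged (elitism).
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            child = survivors[rng.integers(len(survivors))].copy()
            # Mutation: swap one kept unit for one dropped unit.
            kept, dropped = np.flatnonzero(child), np.flatnonzero(~child)
            child[rng.choice(kept)] = False
            child[rng.choice(dropped)] = True
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)


# Simulated stand-in for a learned representation: unit 0 drives treatment
# assignment, units 1 and 2 drive the outcome, the rest are noise.
rng = np.random.default_rng(42)
H = rng.normal(size=(500, 8))
t = (H[:, 0] + 0.5 * rng.normal(size=500) > 0).astype(float)
y = H[:, 1] + H[:, 2] + 0.5 * rng.normal(size=500)

mask = evolve_mask(H, y, t, n_keep=3)
print("selected units:", np.flatnonzero(mask))
```

In this toy setup the treatment-related unit carries no outcome information, so dropping it is costless. The case the abstract emphasizes, where a unit informs both outcome and treatment, is governed here by the assumed trade-off weight `lam`.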