In this extended abstract, we address the problem of interpretability and targeted regularization in causal machine learning models. In particular, we focus on estimating individual causal/treatment effects under observed confounders, which can be controlled for and which moderate the effect of the treatment on the outcome of interest. Black-box ML models adapted to the causal setting generally perform well in this task, but they lack interpretable output identifying the main drivers of treatment heterogeneity and their functional relationship. We propose a novel deep counterfactual learning architecture for estimating individual treatment effects that can simultaneously: i) convey targeted regularization on, and quantify uncertainty around, the quantity of interest (i.e., the Conditional Average Treatment Effect); ii) disentangle the baseline prognostic and moderating effects of the covariates and output interpretable score functions describing their relationship with the outcome. Finally, we demonstrate the use of the method via a simple simulated experiment.
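To make the separation between prognostic and moderating effects concrete, the following is a minimal sketch, not the proposed deep architecture. It assumes an outcome model of the form y = mu(x) + tau(x) * t + noise, where mu is the prognostic part and tau is the Conditional Average Treatment Effect, and it replaces the neural network with a linear least-squares fit. The data-generating process, the function names (`simulate`, `fit_separable`, `predict_cate`), and all coefficients are illustrative assumptions.

```python
import numpy as np


def simulate(n=5000, seed=0):
    """Simulate data with an observed confounder (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=(n, 2))
    # Treatment probability depends on x[:, 0], so x[:, 0] is a confounder.
    propensity = 1.0 / (1.0 + np.exp(-x[:, 0]))
    t = rng.binomial(1, propensity)
    mu = 1.0 + 2.0 * x[:, 0] - x[:, 1]   # prognostic effect
    tau = 0.5 + 1.5 * x[:, 1]            # moderating effect (true CATE)
    y = mu + tau * t + rng.normal(0.0, 0.1, size=n)
    return x, t, y, tau


def fit_separable(x, t, y):
    """Fit y ~ mu(x) + tau(x) * t with linear mu and tau via least squares."""
    base = np.hstack([np.ones((x.shape[0], 1)), x])   # columns for mu(x)
    design = np.hstack([base, base * t[:, None]])     # columns for tau(x) * t
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    k = base.shape[1]
    return coef[:k], coef[k:]                         # (mu_coef, tau_coef)


def predict_cate(x, tau_coef):
    """Evaluate the fitted moderating function tau(x)."""
    return tau_coef[0] + x @ tau_coef[1:]


x, t, y, tau_true = simulate()
mu_coef, tau_coef = fit_separable(x, t, y)
cate_hat = predict_cate(x, tau_coef)
rmse = float(np.sqrt(np.mean((cate_hat - tau_true) ** 2)))
print("tau coefficients:", tau_coef)
print("CATE RMSE:", rmse)
```

Because the two blocks of coefficients are kept separate, the entries of `tau_coef` can be read directly as the contribution of each covariate to treatment effect heterogeneity. This plays the role, in a much simpler form, of the interpretable score functions mentioned above.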