We introduce Matched Machine Learning, a framework that combines the flexibility of machine learning black boxes with the interpretability of matching, a longstanding tool in observational causal inference. Interpretability is paramount in many high-stakes applications of causal inference. Current tools for nonparametric estimation of both average and individualized treatment effects are black boxes that do not allow human auditing of estimates. Our framework uses machine learning to learn an optimal metric for matching units and estimating outcomes, thereby achieving the performance of machine learning black boxes while remaining interpretable. Our general framework encompasses several published works as special cases. We provide asymptotic inference theory for the proposed framework, enabling users to construct approximate confidence intervals around estimates of both individualized and average treatment effects. We show empirically that instances of Matched Machine Learning perform on par with black-box machine learning methods and better than existing matching methods for similar problems. Finally, in our application we show how Matched Machine Learning can be used to perform causal inference even when covariate data are highly complex: we study an image dataset and produce high-quality matches and estimates of treatment effects.
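To make the idea of learning a metric and then matching concrete, the following is a minimal illustrative sketch and not the authors' algorithm. It assumes a simulated dataset, uses an ordinary least-squares fit as a stand-in for the machine learning step, and treats the absolute coefficients as feature weights for a weighted Euclidean distance. The function names (`simulate`, `learn_weights`, `matched_cate`), the choice of `k`, and the data-generating process are all hypothetical. For simplicity the weights are learned on the same sample that is matched, and a unit may appear among its own neighbors.

```python
import numpy as np


def simulate(n=400, p=5, tau=1.0, seed=0):
    """Toy data (assumption): only the first covariate drives the outcome."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))
    T = rng.integers(0, 2, size=n)  # randomized binary treatment
    Y = 2.0 * X[:, 0] + tau * T + rng.normal(scale=0.1, size=n)
    return X, T, Y


def learn_weights(X, T, Y):
    """Stand-in for the learning step: |OLS coefficient| per covariate."""
    A = np.column_stack([np.ones(len(Y)), X, T])  # intercept, covariates, treatment
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return np.abs(coef[1:1 + X.shape[1]])


def matched_cate(X, T, Y, weights, k=5):
    """Match each unit to its k nearest treated and k nearest control units
    under the weighted metric; return per-unit effect estimates and the
    matched groups, which can be inspected directly."""
    Z = X * weights  # rescale covariates by learned importance
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)  # pairwise squared distances
    treated = np.flatnonzero(T == 1)
    control = np.flatnonzero(T == 0)
    nn_t = treated[np.argsort(d2[:, treated], axis=1)[:, :k]]
    nn_c = control[np.argsort(d2[:, control], axis=1)[:, :k]]
    cate = Y[nn_t].mean(axis=1) - Y[nn_c].mean(axis=1)
    return cate, nn_t, nn_c


X, T, Y = simulate()
w = learn_weights(X, T, Y)
cate, nn_t, nn_c = matched_cate(X, T, Y, w)
ate = cate.mean()  # average of the individualized estimates
print("learned weights:", np.round(w, 3))
print("estimated ATE:", round(float(ate), 3))
```

Because each estimate is a difference of outcome means over an explicit set of matched units (`nn_t[i]` and `nn_c[i]`), an analyst can list those units and judge whether the comparison is reasonable, which is the sense of interpretability the abstract appeals to.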