Understanding causal relationships is one of the most important goals of modern science. So far, the causal inference literature has focused almost exclusively on outcomes that take values in the Euclidean space $\mathbb{R}^p$. However, complex datasets are increasingly best summarized as data points in non-linear spaces. In this paper, we present a novel framework of causal effects for outcomes from the Wasserstein space of cumulative distribution functions, which, in contrast to the Euclidean space, is non-linear. We develop doubly robust estimators and associated asymptotic theory for these causal effects. As an illustration, we use our framework to quantify the causal effect of marriage on physical activity patterns using wearable device data collected through the National Health and Nutrition Examination Survey.
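To make the geometry of the outcome space concrete, here is a minimal sketch that assumes one-dimensional distributions, for which the 2-Wasserstein distance is the $L^2$ distance between quantile functions and the Wasserstein mean is obtained by averaging quantile functions rather than cumulative distribution functions. The function names and the grid size are hypothetical choices made for illustration; this is not the estimator developed in the paper.

```python
import numpy as np


def quantile_function(sample, grid):
    """Empirical quantile function of a 1-D sample, evaluated at probabilities in `grid`."""
    return np.quantile(np.asarray(sample, dtype=float), grid)


def wasserstein2(sample_a, sample_b, n_grid=1000):
    """Approximate 2-Wasserstein distance between two 1-D empirical distributions.

    Uses the 1-D identity W_2(F, G)^2 = integral over (0, 1) of (F^{-1}(u) - G^{-1}(u))^2 du,
    approximated by a midpoint rule on `n_grid` probability levels (an illustrative choice).
    """
    grid = (np.arange(n_grid) + 0.5) / n_grid
    qa = quantile_function(sample_a, grid)
    qb = quantile_function(sample_b, grid)
    return float(np.sqrt(np.mean((qa - qb) ** 2)))


def wasserstein_mean_quantiles(samples, n_grid=1000):
    """Quantile function of the 1-D Wasserstein mean (barycenter) of several samples.

    In one dimension the barycenter's quantile function is the pointwise average
    of the individual quantile functions.
    """
    grid = (np.arange(n_grid) + 0.5) / n_grid
    quantiles = np.stack([quantile_function(s, grid) for s in samples])
    return grid, quantiles.mean(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.normal(size=500)
    shifted = base + 3.0  # a pure location shift of the same sample

    print(wasserstein2(base, shifted))
    grid, mean_q = wasserstein_mean_quantiles([base, shifted])
    print(mean_q[:3])
```

A pure location shift moves every quantile by the same amount, so the distance between `base` and `shifted` equals the size of the shift, and their Wasserstein mean is the base sample shifted halfway. Averaging the two cumulative distribution functions instead would give a two-component mixture, which is one way to see why linear operations on CDFs do not respect this geometry.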