Understanding causal relationships is one of the most important goals of modern science. So far, the causal inference literature has focused almost exclusively on outcomes coming from a linear space, most commonly the Euclidean space. However, it is increasingly common that complex datasets collected through electronic sources, such as wearable devices and medical imaging, cannot be represented as data points from linear spaces. In this paper, we present a formal definition of causal effects for outcomes from non-linear spaces, with a focus on the Wasserstein space of cumulative distribution functions. We develop doubly robust estimators and associated asymptotic theory for these causal effects. Our framework extends to outcomes from certain Riemannian manifolds. As an illustration, we use our framework to quantify the causal effect of marriage on physical activity patterns using wearable device data collected through the National Health and Nutrition Examination Survey.
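To make the idea of a doubly robust estimator for distribution-valued outcomes concrete, here is a minimal sketch. It is not the paper's estimator. It assumes that each outcome is a one-dimensional distribution summarized by its quantile function on a common probability grid, that averaging in the 2-Wasserstein space reduces to pointwise averaging of quantile functions (a standard fact for distributions on the real line), and that the effect is reported as the difference between the two averaged quantile functions. The function name `dr_quantile_effect` and its arguments are hypothetical; the propensity scores and outcome-regression predictions are supplied by the user.

```python
import numpy as np

def dr_quantile_effect(Q, A, e_hat, m1_hat, m0_hat):
    """Augmented inverse-probability-weighted contrast, applied pointwise on a quantile grid.

    Q       : (n, T) observed quantile functions, one row per subject
    A       : (n,)   binary treatment indicator
    e_hat   : (n,)   estimated propensity scores P(A = 1 | X), assumed strictly in (0, 1)
    m1_hat  : (n, T) predicted quantile functions under treatment
    m0_hat  : (n, T) predicted quantile functions under control
    Returns : (T,)   estimated difference of mean quantile functions (treated minus control)
    """
    Q = np.asarray(Q, dtype=float)
    m1_hat = np.asarray(m1_hat, dtype=float)
    m0_hat = np.asarray(m0_hat, dtype=float)
    A = np.asarray(A, dtype=float)[:, None]      # column vector, broadcasts over the grid
    e = np.asarray(e_hat, dtype=float)[:, None]

    # Outcome-model prediction plus an inverse-probability-weighted residual correction.
    mu1 = np.mean(m1_hat + A / e * (Q - m1_hat), axis=0)
    mu0 = np.mean(m0_hat + (1.0 - A) / (1.0 - e) * (Q - m0_hat), axis=0)
    return mu1 - mu0

# Toy data (fabricated purely for illustration): treated quantile functions
# are the control ones shifted up by 1, and the outcome model is deliberately
# set to zero so that only the weighting term contributes.
grid = np.linspace(0.05, 0.95, 19)
Q = np.vstack([grid + 1.0, grid + 1.0, grid, grid])
A = np.array([1, 1, 0, 0])
e_hat = np.full(4, 0.5)
zeros = np.zeros_like(Q)
effect = dr_quantile_effect(Q, A, e_hat, zeros, zeros)
```

In this toy setting the estimated effect is a constant shift of 1 at every grid point, even though the outcome model is wrong, because the propensity scores are correct. A pointwise estimate of this kind is not guaranteed to be monotone in the probability level, so a practical implementation would likely need a projection back onto valid quantile functions.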