Selecting powerful predictors of an outcome is a cornerstone task in machine learning. However, some types of questions can only be answered by identifying the predictors that causally affect the outcome. A recent approach to this causal inference problem leverages the invariance property of a causal mechanism across differing experimental environments (Peters et al., 2016; Heinze-Deml et al., 2018). This method, invariant causal prediction (ICP), has a substantial computational drawback: its runtime scales exponentially with the number of candidate causal variables. In this work, we show that the approach taken in ICP can be reformulated as a series of nonparametric tests whose number scales linearly in the number of predictors. Each of these tests relies on the minimization of a novel loss function -- the Wasserstein variance -- which is derived from tools in optimal transport theory and quantifies distributional variability across environments. We prove under mild assumptions that our method recovers the set of identifiable direct causes, and we demonstrate in our experiments that it is competitive with other benchmark causal discovery algorithms.
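To make the notion of distributional variability concrete, the following is a minimal sketch of one natural formalization of a Wasserstein variance in the one-dimensional case: the average squared 2-Wasserstein distance from each environment's empirical distribution to their Wasserstein barycenter. This is an illustrative assumption consistent with the abstract's description, not necessarily the paper's exact definition; the names `wasserstein_variance` and `samples_by_env` are hypothetical.

```python
# Sketch of a barycenter-based "Wasserstein variance" for 1-D empirical
# distributions with equal sample sizes. Assumed definition, not the paper's
# verbatim loss: average squared W_2 distance to the Wasserstein barycenter.
import numpy as np

def wasserstein_variance(samples_by_env):
    """samples_by_env: list of 1-D arrays, one per environment, equal length n."""
    # In 1-D, W_2^2 between two n-atom empirical measures is the mean squared
    # difference of their sorted samples (quantile functions), and the W_2
    # barycenter's quantile function is the pointwise average of them.
    sorted_envs = np.stack([np.sort(np.asarray(x)) for x in samples_by_env])
    barycenter = sorted_envs.mean(axis=0)            # 1-D Wasserstein barycenter
    return np.mean((sorted_envs - barycenter) ** 2)  # avg squared W_2 distance

# Toy usage: distributions that are stable across environments score near zero,
# while a shifted environment inflates the Wasserstein variance.
rng = np.random.default_rng(0)
stable = [rng.normal(0, 1, 500) for _ in range(3)]
shifted = stable[:2] + [rng.normal(2, 1, 500)]
print(wasserstein_variance(stable))   # small
print(wasserstein_variance(shifted))  # noticeably larger
```

Under this formalization, an invariant mechanism yields environment-wise distributions that coincide, driving the loss toward zero, which is what the nonparametric tests described above would exploit.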