Causal identification is at the core of the causal inference literature, where complete algorithms have been proposed to identify causal queries of interest. The validity of these algorithms hinges on the restrictive assumption that a correctly specified causal structure is available. In this work, we study the setting where only a probabilistic model of the causal structure is available. Specifically, each edge in the causal graph is assigned a probability, which may, for example, represent a domain expert's degree of belief that the edge exists. Alternatively, the uncertainty about an edge may reflect the confidence of a particular statistical test. The question that naturally arises in this setting is: given such a probabilistic graph and a specific causal effect of interest, which subgraph has the highest plausibility among those in which the causal effect is identifiable? We show that answering this question reduces to solving an NP-hard combinatorial optimization problem, which we call the edge ID problem. We propose efficient algorithms that approximate solutions to this problem, and we evaluate them on real-world networks and randomly generated graphs.
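To make the question above concrete, the following is a minimal brute-force sketch of the search it describes, not the algorithms proposed in this work. It rests on assumptions that are not stated in the abstract: edges are treated as independent, so the plausibility of a subgraph is taken to be the product of p over kept edges and (1 - p) over dropped edges, and identifiability is delegated to a caller-supplied oracle. The names `plausibility`, `best_identifiable_subgraph`, and `toy_oracle` are hypothetical. The toy oracle encodes only the well-known two-node "bow" case, in which the effect of X on Y is not identifiable when both X->Y and the bidirected edge X<->Y are present.

```python
from itertools import combinations
from math import prod
from typing import Callable, Dict, FrozenSet, Optional, Tuple

Edge = str  # hypothetical edge label, e.g. "X->Y" or "X<->Y"


def plausibility(kept: FrozenSet[Edge], edge_probs: Dict[Edge, float]) -> float:
    """Probability of exactly this edge set, ASSUMING independent edges."""
    return prod(p if e in kept else 1.0 - p for e, p in edge_probs.items())


def best_identifiable_subgraph(
    edge_probs: Dict[Edge, float],
    is_identifiable: Callable[[FrozenSet[Edge]], bool],
) -> Tuple[Optional[FrozenSet[Edge]], float]:
    """Enumerate every edge subset (exponential time) and return the most
    plausible one accepted by the identifiability oracle."""
    edges = list(edge_probs)
    best: Optional[FrozenSet[Edge]] = None
    best_score = -1.0
    for k in range(len(edges) + 1):
        for subset in combinations(edges, k):
            kept = frozenset(subset)
            if not is_identifiable(kept):
                continue
            score = plausibility(kept, edge_probs)
            if score > best_score:
                best, best_score = kept, score
    return best, best_score


def toy_oracle(kept: FrozenSet[Edge]) -> bool:
    """Toy two-node oracle: only the bow graph (X->Y together with X<->Y)
    is treated as non-identifiable for the effect of X on Y."""
    return not {"X->Y", "X<->Y"} <= kept


# Illustrative probabilities, invented for this example.
edge_probs = {"X->Y": 0.9, "X<->Y": 0.6}
best, score = best_identifiable_subgraph(edge_probs, toy_oracle)
print(sorted(best), round(score, 2))
```

In this toy instance the full graph is the most plausible edge set (0.9 × 0.6 = 0.54) but is rejected by the oracle, so the search returns the subgraph containing only X->Y, with plausibility 0.9 × 0.4 = 0.36. Exhaustive enumeration is feasible only for a handful of uncertain edges, which is consistent with the NP-hardness result motivating approximate algorithms.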