Learning causal relationships between variables is a fundamental task in causal inference, and directed acyclic graphs (DAGs) are a popular choice for representing these relationships. Because a causal graph can be recovered only up to its Markov equivalence class from observational data, interventions are often used for the recovery task. Interventions are generally costly, so it is important to design algorithms that minimize the number of interventions performed. In this work, we study the problem of identifying the smallest set of interventions required to orient a subset of edges (target edges). Under the assumptions of faithfulness, causal sufficiency, and ideal interventions, we study this problem in two settings: when the underlying ground truth causal graph is known (subset verification) and when it is unknown (subset search). For the subset verification problem, we provide an efficient algorithm to compute a minimum-sized interventional set; we further extend these results to bounded-size non-atomic interventions and node-dependent interventional costs. For the subset search problem, we show that, in the worst case, no algorithm (even with adaptivity or randomization) can achieve an approximation ratio that is asymptotically better than the vertex cover of the target edges when compared with the subset verification number. This result is surprising, as there exists a logarithmic approximation algorithm for the search problem when the goal is to recover the whole causal graph. To obtain our results, we prove several structural properties of interventional causal graphs that we believe have applications beyond the subset verification and search problems studied here.
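To make the role of the vertex cover concrete, here is a minimal Python sketch. It assumes ideal atomic interventions and uses only the standard fact that intervening on a single vertex reveals the orientation of every edge incident to it. Under that assumption, intervening on each vertex of any vertex cover of the target edges orients all of them. The function names (`oriented_by_atomic`, `matching_vertex_cover`) and the toy graph are hypothetical illustrations, not the paper's algorithm. The sketch ignores further orientations implied by Meek rules, so the resulting set is generally not minimum.

```python
# Edges are (parent, child) pairs of a hypothetical ground-truth DAG.

def oriented_by_atomic(edges, v):
    """Edges whose direction is revealed directly by intervening on v alone
    (every edge incident to v); Meek-rule propagation is ignored."""
    return {(a, b) for (a, b) in edges if v in (a, b)}

def matching_vertex_cover(target_edges):
    """Vertex cover of the target edges built from a maximal matching
    (at most twice the minimum size). Not the paper's method."""
    cover = set()
    for a, b in target_edges:
        if a not in cover and b not in cover:
            cover.update((a, b))
    return cover

# Toy example (assumed data): a 4-vertex DAG with two target edges.
dag_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
targets = [("a", "b"), ("c", "d")]

cover = matching_vertex_cover(targets)
revealed = set().union(*(oriented_by_atomic(dag_edges, v) for v in cover))

# Every target edge touches some intervened vertex, so all are oriented.
assert set(targets) <= revealed
```

This gives only a simple upper bound on the number of atomic interventions needed for the target edges; the subset verification number can be much smaller.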