A well-studied challenge in the structure learning problem for causal directed acyclic graphs (DAGs) is that, using observational data alone, one can only learn the graph up to a "Markov equivalence class" (MEC). The remaining undirected edges have to be oriented using interventions, which can be very expensive to perform in applications. Thus, the problem of minimizing the number of interventions needed to fully orient the MEC has received a lot of recent attention, and is also the focus of this work. We prove two main results. The first is a new universal lower bound on the number of atomic interventions that any algorithm (whether active or passive) would need to perform in order to orient a given MEC. Our second result shows that this bound is, in fact, within a factor of two of the size of the smallest set of atomic interventions that can orient the MEC. Our lower bound is provably better than previously known lower bounds. Its proof is based on the new notion of CBSP orderings, which are topological orderings of DAGs without v-structures that satisfy certain special properties. Further, using simulations on synthetic graphs and by giving examples of special graph families, we show that our bound is often significantly better than previously known bounds.