A well-studied challenge in structure learning for causal directed acyclic graphs (DAGs) is that, from observational data alone, one can learn the graph only up to its Markov equivalence class (MEC). The remaining undirected edges have to be oriented using interventions, which can be very expensive to perform in applications. The problem of minimizing the number of interventions needed to fully orient the MEC has therefore received considerable recent attention, and it is the focus of this work. We prove two main results. The first is a new universal lower bound on the number of atomic interventions that any algorithm (whether active or passive) would need to perform in order to orient a given MEC. The second shows that this bound is, in fact, within a factor of two of the size of the smallest set of atomic interventions that can orient the MEC. Our lower bound is provably better than previously known lower bounds. Its proof is based on the new notion of clique-block shared-parents (CBSP) orderings, which are topological orderings of DAGs without v-structures that satisfy certain special properties. Further, using simulations on synthetic graphs and examples of special graph families, we show that our bound is often significantly better than previously known lower bounds.
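To make the notion of a Markov equivalence class concrete, the following is a minimal sketch that is not taken from the paper. It relies on the standard characterization that two DAGs over the same nodes are Markov equivalent exactly when they share the same skeleton and the same v-structures. The function names and the edge-list representation are illustrative assumptions, and the sketch assumes every node appears in at least one edge.

```python
from itertools import combinations


def skeleton(edges):
    """Return the undirected skeleton of a DAG given as (parent, child) pairs."""
    return {frozenset(edge) for edge in edges}


def v_structures(edges):
    """Return all colliders a -> c <- b in which a and b are non-adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, ps in parents.items():
        # Sort the parents so each collider gets one canonical representation.
        for a, b in combinations(sorted(ps), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found


def markov_equivalent(edges1, edges2):
    """Check equivalence via same skeleton and same v-structures.

    Assumption: both DAGs are defined over the same node set and have
    no isolated nodes, since nodes are inferred from the edge lists.
    """
    return (skeleton(edges1) == skeleton(edges2)
            and v_structures(edges1) == v_structures(edges2))


# Three orientations of X - Y - Z that observational data cannot tell apart,
# and one (the collider) that it can.
chain = [("X", "Y"), ("Y", "Z")]
reversed_chain = [("Z", "Y"), ("Y", "X")]
fork = [("Y", "X"), ("Y", "Z")]
collider = [("X", "Y"), ("Z", "Y")]

same_class = markov_equivalent(chain, reversed_chain) and markov_equivalent(chain, fork)
collider_differs = not markov_equivalent(chain, collider)
```

In this toy case the chain, the reversed chain, and the fork fall into one MEC, so choosing among them would require an intervention, while the collider is already identifiable from its v-structure.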