Constraint-based causal discovery algorithms learn part of the causal graph structure by systematically testing conditional independences observed in the data. These algorithms, such as the PC algorithm and its variants, rely on graphical characterizations of the so-called Markov equivalence class of causal graphs proposed by Pearl. However, constraint-based causal discovery algorithms struggle when data is limited, since conditional independence tests quickly lose their statistical power, especially when the conditioning set is large. To address this, we propose using only conditional independence tests in which the size of the conditioning set is bounded above by some integer $k$, for more robust causal discovery. The existing graphical characterizations of the equivalence classes of causal graphs are not applicable when we cannot leverage all the conditional independence statements. We first define the notion of $k$-Markov equivalence: two causal graphs are $k$-Markov equivalent if they entail the same conditional independence constraints among those whose conditioning set has size at most $k$. We then propose a novel representation that allows us to graphically characterize $k$-Markov equivalence between two causal graphs, and a sound constraint-based algorithm, called the $k$-PC algorithm, for learning this equivalence class. Finally, we conduct synthetic and semi-synthetic experiments to demonstrate that the $k$-PC algorithm enables more robust causal discovery in the small-sample regime than the baseline PC algorithm.
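To make the role of the bound $k$ concrete, the following is a minimal sketch of a PC-style skeleton search that only runs tests with conditioning sets of size at most $k$. It is not the $k$-PC algorithm itself, which also has to orient edges and produce the proposed representation. The Fisher z-test (which assumes jointly Gaussian variables), the function names `partial_corr`, `is_independent` and `bounded_skeleton`, and the three-variable chain used as input are all assumptions made for illustration.

```python
import itertools
import math

import numpy as np


def partial_corr(corr, i, j, cond):
    """Partial correlation of variables i and j given the variables in cond."""
    idx = [i, j, *cond]
    precision = np.linalg.inv(corr[np.ix_(idx, idx)])
    return -precision[0, 1] / math.sqrt(precision[0, 0] * precision[1, 1])


def is_independent(corr, n, i, j, cond, alpha=0.05):
    """Fisher z-test on a correlation matrix estimated from n samples.

    Assumption: the variables are jointly Gaussian.
    """
    r = partial_corr(corr, i, j, cond)
    r = min(max(r, -0.999999), 0.999999)  # keep atanh finite
    z = math.sqrt(n - len(cond) - 3) * abs(math.atanh(r))
    p_value = math.erfc(z / math.sqrt(2))  # two-sided normal tail
    return p_value > alpha


def bounded_skeleton(corr, n, k, alpha=0.05):
    """PC-style skeleton search using conditioning sets of size at most k."""
    d = corr.shape[0]
    adj = {v: set(range(d)) - {v} for v in range(d)}
    for size in range(k + 1):  # never test with more than k conditioning variables
        for i, j in itertools.combinations(range(d), 2):
            if j not in adj[i]:
                continue
            # Candidate conditioning sets come from the current neighbours
            # of either endpoint, as in the standard PC skeleton phase.
            candidates = set()
            for a, b in ((i, j), (j, i)):
                candidates.update(
                    itertools.combinations(sorted(adj[a] - {b}), size)
                )
            for cond in sorted(candidates):
                if is_independent(corr, n, i, j, cond, alpha):
                    adj[i].discard(j)
                    adj[j].discard(i)
                    break
    return sorted((i, j) for i in adj for j in adj[i] if i < j)


# Hypothetical input: correlation matrix implied by a chain X0 -> X1 -> X2
# with correlation 0.6 along each edge, treated as estimated from 500 samples.
chain_corr = np.array([
    [1.00, 0.60, 0.36],
    [0.60, 1.00, 0.60],
    [0.36, 0.60, 1.00],
])

print(bounded_skeleton(chain_corr, n=500, k=0))  # [(0, 1), (0, 2), (1, 2)]
print(bounded_skeleton(chain_corr, n=500, k=1))  # [(0, 1), (1, 2)]
```

With $k = 0$ only marginal tests are available, so the chain cannot be told apart from the complete graph. Allowing a single conditioning variable ($k = 1$) removes the edge between `X0` and `X2`. This is the sense in which a smaller $k$ yields a coarser equivalence class.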