Causal graph discovery and causal effect estimation are two fundamental tasks in causal inference. While many methods have been developed for each task individually, statistical challenges arise when applying these methods jointly: estimating causal effects after running causal discovery algorithms on the same data leads to "double dipping," invalidating coverage guarantees of classical confidence intervals. To address this problem, we develop tools for valid post-causal-discovery inference. One key contribution is a randomized version of the greedy equivalence search (GES) algorithm, which permits a valid, finite-sample correction of classical confidence intervals. Across empirical studies, we show that a naive combination of causal discovery and subsequent inference algorithms typically leads to highly inflated miscoverage rates; at the same time, our noisy GES method provides reliable coverage control while achieving more accurate causal graph recovery than data splitting.
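To make the "double dipping" problem concrete, the following is a minimal toy sketch, not the GES algorithm and not the method proposed here. It assumes a much simpler setting chosen only for illustration: `m` candidate quantities, each with true mean 0 and known unit variance, where the "discovery" step picks the candidate with the largest absolute sample mean and the "inference" step builds a classical 95% z-interval for it. All names (`covers_zero`, `pick_largest`, `simulate`) and parameter values are hypothetical.

```python
import math
import random

Z95 = 1.96  # two-sided 95% normal quantile; unit variance is assumed known


def covers_zero(sample):
    """Classical 95% z-interval for the mean of unit-variance data: does it contain 0?"""
    mean = sum(sample) / len(sample)
    half_width = Z95 / math.sqrt(len(sample))
    return abs(mean) <= half_width


def pick_largest(samples):
    """Toy 'discovery' step: index of the candidate with the largest |sample mean|."""
    return max(range(len(samples)),
               key=lambda j: abs(sum(samples[j]) / len(samples[j])))


def simulate(trials=2000, m=10, n=40, seed=0):
    """Estimate coverage of the naive and the data-splitting procedures."""
    rng = random.Random(seed)
    naive_hits = 0
    split_hits = 0
    for _ in range(trials):
        # Every candidate has true mean 0, so a valid interval should cover 0
        # about 95% of the time regardless of which candidate is selected.
        data = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]

        # Naive: select and infer on the same data (double dipping).
        j = pick_largest(data)
        naive_hits += covers_zero(data[j])

        # Data splitting: select on the first half, infer on the second half.
        first_half = [x[: n // 2] for x in data]
        k = pick_largest(first_half)
        split_hits += covers_zero(data[k][n // 2:])
    return naive_hits / trials, split_hits / trials


naive_cov, split_cov = simulate()
print(f"naive coverage: {naive_cov:.3f}, split coverage: {split_cov:.3f}")
```

In this toy setting the naive interval covers the truth only when all `m` independent standardized means fall within ±1.96, which happens with probability 0.95^10 ≈ 0.60 for `m = 10`. The data-splitting interval keeps nominal coverage, but it uses only half the sample for selection and half for inference, which is the cost a randomized selection step aims to reduce.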