Learning causal structures from observation and experimentation is a central task in many domains. For example, in biology, recent advances allow us to obtain single-cell expression data under multiple interventions such as drugs or gene knockouts. However, a key challenge is that often the targets of the interventions are uncertain or unknown. Thus, standard causal discovery methods can no longer be used. To fill this gap, we propose a Bayesian framework (BaCaDI) for discovering the causal structure that underlies data generated under various unknown experimental/interventional conditions. BaCaDI is fully differentiable and operates in the continuous space of latent probabilistic representations of both causal structures and interventions. This enables us to approximate complex posteriors via gradient-based variational inference and to reason about the epistemic uncertainty in the predicted structure. In experiments on synthetic causal discovery tasks and simulated gene-expression data, BaCaDI outperforms related methods in identifying causal structures and intervention targets. Finally, we demonstrate that, thanks to its rigorous Bayesian approach, our method provides well-calibrated uncertainty estimates.