Causal discovery from observational and interventional data is challenging because of limited data and non-identifiability, two factors that introduce uncertainty in estimating the underlying structural causal model (SCM). Selecting experiments (interventions) based on the uncertainty arising from both factors can expedite the identification of the SCM. Existing methods in experimental design for causal discovery from limited data either rely on linear assumptions for the SCM or select only the intervention target. This work incorporates recent advances in Bayesian causal discovery into the Bayesian optimal experimental design framework, allowing for active causal discovery of large, nonlinear SCMs while selecting both the intervention target and its value. We demonstrate the performance of the proposed method on synthetic graphs (Erd\H{o}s-R\'enyi, scale-free) for both linear and nonlinear SCMs as well as on DREAM, an \emph{in-silico} single-cell gene regulatory network dataset.