Learning causal relationships between variables is a well-studied problem in statistics, with many important applications in science. However, modeling real-world systems remains challenging, as most existing algorithms assume that the underlying causal graph is acyclic. While this is a convenient framework for developing theory about causal reasoning and inference, the acyclicity assumption is likely to be violated in real systems, because feedback loops are common (e.g., in biological systems). Although a few methods search for cyclic causal models, they usually either rely on some form of linearity, which is also limiting, or lack a clear underlying probabilistic model. In this work, we propose NODAGS-Flow, a novel framework for learning nonlinear cyclic causal graphical models from interventional data. We perform inference via direct likelihood optimization, employing techniques from residual normalizing flows for likelihood estimation. Through synthetic experiments and an application to single-cell high-content perturbation screening data, we show significant performance improvements with our approach compared to state-of-the-art methods with respect to structure recovery and predictive performance.
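To make the phrase "likelihood estimation with residual normalizing flows" concrete, the following is a minimal sketch, not the NODAGS-Flow implementation. It assumes a hypothetical cyclic model x = f(x) + e with f(x) = tanh(xW) and standard Gaussian noise e; the names `f`, `W`, and `logdet_power_series` are illustrative choices that do not come from the abstract. Because the map x ↦ x − f(x) has the form of a residual-flow layer, the change-of-variables formula gives log p(x) = log p_e(x − f(x)) + log|det(I − J_f(x))|. When f is contractive, the log-determinant equals the power series −Σ_k tr(J^k)/k, which residual flows estimate stochastically; the sketch evaluates the truncated series exactly for simplicity.

```python
import numpy as np

def f(x, W):
    # Hypothetical nonlinear mechanism: W[i, j] is the weight of edge i -> j.
    return np.tanh(x @ W)

def jacobian_f(x, W):
    # f_j(x) = tanh(s_j) with s_j = sum_i x_i W[i, j],
    # so J[j, i] = d f_j / d x_i = (1 - tanh(s_j)^2) * W[i, j].
    s = x @ W
    return (W * (1.0 - np.tanh(s) ** 2)).T

def log_likelihood(x, W):
    # Change of variables for e = x - f(x), assuming e ~ N(0, I).
    d = x.shape[0]
    e = x - f(x, W)
    log_p_e = -0.5 * (e @ e) - 0.5 * d * np.log(2.0 * np.pi)
    _, logabsdet = np.linalg.slogdet(np.eye(d) - jacobian_f(x, W))
    return log_p_e + logabsdet

def logdet_power_series(J, n_terms=30):
    # log det(I - J) = -sum_{k>=1} tr(J^k) / k, valid when the spectral
    # radius of J is below 1. Truncated here; traces are computed exactly.
    total = 0.0
    J_k = np.eye(J.shape[0])
    for k in range(1, n_terms + 1):
        J_k = J_k @ J
        total -= np.trace(J_k) / k
    return total

# A three-node feedback loop 0 -> 1 -> 2 -> 0 (a cyclic graph).
W = np.zeros((3, 3))
W[0, 1], W[1, 2], W[2, 0] = 0.8, -0.6, 0.5
# Rescale so the spectral norm of W is 0.5, which makes f contractive.
W *= 0.5 / np.linalg.norm(W, 2)

x = np.array([0.3, -1.2, 0.7])
J = jacobian_f(x, W)
sign, exact_logdet = np.linalg.slogdet(np.eye(3) - J)
series_logdet = logdet_power_series(J)
ll = log_likelihood(x, W)
```

For graphs with many nodes, the exact determinant becomes expensive, which is why residual flows replace the exact traces in the series with stochastic estimates; that step is omitted here.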