Inferring causal relationships from observational data is rarely straightforward, but the problem is especially difficult in high dimensions. For these applications, causal discovery algorithms typically require parametric restrictions or extreme sparsity constraints. We relax these assumptions and focus on an important but more specialized problem, namely recovering the causal order among a subgraph of variables known to descend from some (possibly large) set of confounding covariates, i.e. a $\textit{confounder blanket}$. This is useful in many settings, for example when studying a dynamic biomolecular subsystem with genetic data providing background information. Under a structural assumption called the $\textit{confounder blanket principle}$, which we argue is essential for tractable causal discovery in high dimensions, our method accommodates graphs of low or high sparsity while maintaining polynomial time complexity. We present a structure learning algorithm that is provably sound and complete with respect to a so-called $\textit{lazy oracle}$. We design inference procedures with finite sample error control for linear and nonlinear systems, and demonstrate our approach on a range of simulated and real-world datasets. An accompanying $\texttt{R}$ package, $\texttt{cbl}$, is available from $\texttt{CRAN}$.