An acyclic causal structure can be described using a directed acyclic graph (DAG) with arrows indicating causation. The task of learning these structures from data is known as ``causal discovery''. Diverse populations or changing environments can sometimes give rise to heterogeneous data. This heterogeneity can be thought of as a mixture model with multiple ``sources'', each exerting its own distinct signature on the observed variables. From this perspective, the source is a latent common cause for every observed variable. While some methods for causal discovery are able to work around unobserved confounding in special cases, the only known ways to deal with a global confounder (such as a latent class) involve parametric assumptions. These assumptions are restrictive, especially for discrete variables. By focusing on discrete observables, we demonstrate that globally confounded causal structures can still be identifiable without parametric assumptions, so long as the number of latent classes remains small relative to the size and sparsity of the underlying DAG.