We propose a constraint-based algorithm, which automatically determines causal relevance thresholds, to infer causal networks from data. We call these topological thresholds. We present two methods for determining the threshold: the first seeks a set of edges that leaves no disconnected nodes in the network; the second seeks a causal large connected component in the data. We tested these methods both for discrete synthetic and real data, and compared the results with those obtained for the PC algorithm, which we took as the benchmark. We show that this novel algorithm is generally faster and more accurate than the PC algorithm. The algorithm for determining the thresholds requires choosing a measure of causality. We tested our methods for Fisher Correlations, commonly used in PC algorithm (for instance in \cite{kalisch2005}), and further proposed a discrete and asymmetric measure of causality, that we called Net Influence, which provided very good results when inferring causal networks from discrete data. This metric allows for inferring directionality of the edges in the process of applying the thresholds, speeding up the inference of causal DAGs.
翻译:暂无翻译