Causal discovery methods aim to determine the causal direction between variables from observational data. Functional causal discovery methods, such as those based on the Linear Non-Gaussian Acyclic Model (LiNGAM), rely on structural and distributional assumptions to infer the causal direction. However, approaches for assessing how the performance of causal discovery methods depends on sample size, or on the assumption violations that are inevitable in real-world scenarios, are lacking. To address this need, we propose the Causal Direction Detection Rate (CDDR) diagnostic, which evaluates whether and to what extent the interaction between assumption violations and sample size affects the ability to identify the hypothesized causal direction. Given a bivariate dataset of size N on a pair of variables, X and Y, the CDDR diagnostic is a plot comparing the probabilities of each causal discovery outcome (e.g., X causes Y, Y causes X, or inconclusive) as functions of sample sizes smaller than N. We fully develop the CDDR diagnostic in the bivariate case and demonstrate its use for two methods: LiNGAM and our new test-based causal discovery approach. We find the CDDR diagnostic for the test-based approach to be more informative because it uses a richer set of causal discovery outcomes. Under certain assumptions, we prove that the estimates of the probability of each possible causal discovery outcome are consistent and asymptotically normal. Through simulations, we study the CDDR diagnostic's behavior when the linearity and non-Gaussianity assumptions are violated. Additionally, we illustrate the CDDR diagnostic on four real datasets, including three for which the causal direction is known.
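The idea of tracking outcome probabilities across sample sizes smaller than N can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the function names `cddr_curve` and `toy_direction_rule`, the choice of subsampling without replacement, the number of repetitions, and the simulated data. In particular, the toy decision rule is a crude placeholder and is neither LiNGAM nor the test-based approach; any method that returns one of the three outcomes could be passed in its place.

```python
import numpy as np

OUTCOMES = ("X->Y", "Y->X", "inconclusive")


def toy_direction_rule(x, y, margin=0.02):
    """Placeholder decision rule (an assumption, not LiNGAM or the paper's test).

    Fits a least-squares line in each direction and scores how strongly the
    squared residual correlates with the squared regressor. The direction
    with the smaller score is returned, unless the two scores differ by
    less than `margin`, in which case the outcome is "inconclusive".
    """
    def dependence(cause, effect):
        cause = cause - cause.mean()
        effect = effect - effect.mean()
        slope = np.dot(cause, effect) / np.dot(cause, cause)
        resid = effect - slope * cause
        return abs(np.corrcoef(cause**2, resid**2)[0, 1])

    d_xy = dependence(x, y)
    d_yx = dependence(y, x)
    if abs(d_xy - d_yx) < margin:
        return "inconclusive"
    return "X->Y" if d_xy < d_yx else "Y->X"


def cddr_curve(x, y, sample_sizes, rule=toy_direction_rule, n_rep=200, seed=0):
    """Estimate the proportion of each outcome at each subsample size m < N.

    For every m, draws `n_rep` subsamples of size m without replacement
    (an assumed resampling scheme), applies `rule` to each subsample, and
    records the share of subsamples yielding each outcome.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    curve = {}
    for m in sample_sizes:
        if not 2 < m < n:
            raise ValueError("each sample size must satisfy 2 < m < N")
        counts = dict.fromkeys(OUTCOMES, 0)
        for _ in range(n_rep):
            idx = rng.choice(n, size=m, replace=False)
            counts[rule(x[idx], y[idx])] += 1
        curve[m] = {k: v / n_rep for k, v in counts.items()}
    return curve


# Simulated example (hypothetical data): linear model with uniform noise.
data_rng = np.random.default_rng(42)
x = data_rng.uniform(-1.0, 1.0, size=500)
y = 0.8 * x + data_rng.uniform(-0.5, 0.5, size=500)

for m, probs in cddr_curve(x, y, [50, 100, 200], n_rep=100).items():
    print(m, probs)
```

Plotting the three estimated proportions against m gives a picture in the spirit of the diagnostic. The decision rule is passed as a callable so that the resampling loop stays independent of the causal discovery method being examined.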