Reconstructing the causal relationships behind observed phenomena is a fundamental challenge in all areas of science. In complex systems, discovering causal relationships through experiments is often infeasible, unethical, or expensive. However, increases in computational power allow us to process the ever-growing amount of data that modern science generates, which has led to growing interest in causal discovery from observational data. This work evaluates the LPCMCI algorithm, which aims to find generators compatible with a multi-dimensional, highly autocorrelated time series in which some variables are unobserved. We find that LPCMCI performs much better than a random baseline that mimics having no knowledge of the system, but it is still far from optimal detection. Furthermore, LPCMCI performs best on auto-dependencies, followed by contemporaneous dependencies, and struggles most with lagged dependencies. The source code of this project is available online.