Inferring nonlinear and asymmetric causal relationships in multivariate longitudinal data is a challenging task with applications in clinical medicine, mathematical biology, economics and environmental research. A number of methods for inferring causal relationships within complex dynamic and stochastic systems have been proposed, but there is no unified, consistent definition of causality in this context. We evaluate the performance of ten prominent bivariate causality indices for time series data across four simulated model systems that differ in their coupling schemes and characteristics. In further experiments, we show that these methods may not always be invariant to transformations that arise in real-world data (data availability, standardisation and scaling, rounding error, missing data and noisy data). We recommend transfer entropy and nonlinear Granger causality as likely to be particularly robust indices for estimating bivariate causal relationships in real-world applications. Finally, we provide flexible open-access Python code for computing these methods and for the model simulations.
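To make the notion of an asymmetric bivariate causality index concrete, the following is a minimal sketch of a transfer entropy estimate. It is not the code released with this work. It assumes a simple plug-in estimator with equal-frequency binning and a history length of one, and it is tested on a hypothetical pair of linear autoregressive series in which x drives y. The function names `transfer_entropy` and `simulate_coupled_ar`, the bin count and the coupling strength are all illustrative assumptions, and the simulated system is not one of the four model systems evaluated here.

```python
import numpy as np


def simulate_coupled_ar(n=5000, coupling=0.8, seed=0):
    """Hypothetical test system: x evolves on its own, y is driven by past x."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    y = np.zeros(n)
    for i in range(1, n):
        x[i] = 0.5 * x[i - 1] + rng.normal()
        y[i] = 0.5 * y[i - 1] + coupling * x[i - 1] + rng.normal()
    return x, y


def _discretise(series, bins):
    """Map values to integer labels 0..bins-1 using equal-frequency (rank) bins."""
    ranks = np.argsort(np.argsort(series))
    return (ranks * bins) // len(series)


def transfer_entropy(source, target, bins=4):
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    TE = sum p(t_next, t_past, s_past)
             * log2[ p(t_next, t_past, s_past) * p(t_past)
                     / (p(t_past, s_past) * p(t_next, t_past)) ]
    """
    s = _discretise(np.asarray(source, dtype=float), bins)
    t = _discretise(np.asarray(target, dtype=float), bins)
    t_next, t_past, s_past = t[1:], t[:-1], s[:-1]

    # Empirical joint distribution over (t_next, t_past, s_past).
    joint = np.zeros((bins, bins, bins))
    np.add.at(joint, (t_next, t_past, s_past), 1)
    joint /= joint.sum()

    p_tpast_spast = joint.sum(axis=0)   # p(t_past, s_past)
    p_tnext_tpast = joint.sum(axis=2)   # p(t_next, t_past)
    p_tpast = joint.sum(axis=(0, 2))    # p(t_past)

    te = 0.0
    # Only non-zero joint cells contribute to the sum.
    for i, j, k in zip(*np.nonzero(joint)):
        ratio = joint[i, j, k] * p_tpast[j] / (
            p_tpast_spast[j, k] * p_tnext_tpast[i, j]
        )
        te += joint[i, j, k] * np.log2(ratio)
    return te


x, y = simulate_coupled_ar()
te_x_to_y = transfer_entropy(x, y)
te_y_to_x = transfer_entropy(y, x)
print(f"TE(x -> y) = {te_x_to_y:.4f} bits")
print(f"TE(y -> x) = {te_y_to_x:.4f} bits")
```

In this sketch the estimate in the driving direction (x to y) should come out clearly larger than in the reverse direction. The reverse estimate is generally not exactly zero, because plug-in estimators are biased upwards on finite samples. That sensitivity to sample size and discretisation is one reason the invariance experiments described above matter in practice.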