Granger causality is among the most widely used data-driven approaches for causal analysis of time series data, with applications in areas including economics, molecular biology, and neuroscience. Two of the main challenges of this methodology are (1) over-fitting as a result of limited data duration and (2) correlated process noise acting as a confounding factor; both lead to errors in identifying causal influences. Sparse estimation via the LASSO has successfully addressed these challenges for parameter estimation. However, the classical statistical tests for Granger causality rely on asymptotic analysis of ordinary least squares, which requires long data durations to be useful and is not immune to confounding effects. In this work, we close this gap by introducing a LASSO-based statistic and studying its non-asymptotic properties under the assumption that the true models admit sparse autoregressive representations. We establish that the sufficient conditions for the LASSO also suffice for robust identification of Granger causal influences. We also characterize the false positive error probability of a simple thresholding rule for identifying Granger causal effects. We present simulation studies and an application to real data that compare the performance of ordinary least squares and the LASSO in detecting Granger causal influences; the results corroborate our theoretical findings.
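To make the idea of a LASSO-based statistic with a thresholding rule concrete, the following is a minimal sketch, not the construction studied in this work. It assumes a statistic of the form log(MSE_reduced / MSE_full), where the reduced model regresses the target on its own lags and the full model adds the lags of the candidate cause, both fitted by the LASSO. The helper names (`lasso_cd`, `lagged_design`, `gc_statistic`, `simulate`), the regularization weight `lam`, the model order, and the threshold value are all hypothetical choices made for illustration.

```python
import math
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


def lasso_cd(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent.

    Returns the coefficients and the residuals y - Xb. No intercept is
    fitted, so the data are assumed to be (approximately) zero mean.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (b[j] removed).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * b[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                b[j] = new
    return b, resid


def lagged_design(target, drivers, order):
    """Rows of lagged driver values aligned with target[t], for t >= order."""
    rows, resp = [], []
    for t in range(order, len(target)):
        rows.append([s[t - k] for s in drivers for k in range(1, order + 1)])
        resp.append(target[t])
    return rows, resp


def mean_square(values):
    return sum(v * v for v in values) / len(values)


def gc_statistic(x, y, order=2, lam=0.1):
    """Assumed statistic log(MSE_reduced / MSE_full) for the link x -> y."""
    X_red, resp = lagged_design(y, [y], order)
    X_full, _ = lagged_design(y, [y, x], order)
    _, r_red = lasso_cd(X_red, resp, lam)
    _, r_full = lasso_cd(X_full, resp, lam)
    return math.log(mean_square(r_red) / mean_square(r_full))


def simulate(n, coupling, seed):
    """Toy bivariate AR(1) pair in which x drives y with the given coupling."""
    rng = random.Random(seed)
    x, y = [0.0], [0.0]
    for _ in range(n - 1):
        x_new = 0.5 * x[-1] + rng.gauss(0, 1)
        y_new = 0.5 * y[-1] + coupling * x[-1] + rng.gauss(0, 1)
        x.append(x_new)
        y.append(y_new)
    return x, y


if __name__ == "__main__":
    threshold = 0.05  # arbitrary illustrative value, not taken from the paper
    for c in (0.8, 0.0):
        x, y = simulate(500, c, seed=0)
        stat = gc_statistic(x, y)
        print(f"coupling={c}: statistic={stat:.3f}, detected={stat > threshold}")
```

In this toy setting the statistic should be clearly positive when the coupling is non-zero and close to zero otherwise, because the LASSO tends to set the coefficients of irrelevant lags exactly to zero. A threshold with a guaranteed false positive rate would have to come from the non-asymptotic analysis itself rather than from the hand-picked value used here.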