Assessing the Significance of Directed and Multivariate Measures of Linear Dependence Between Time Series

Inferring linear dependence between time series is central to our understanding of natural and artificial systems. Unfortunately, the hypothesis tests that are used to determine statistically significant directed or multivariate relationships from time-series data often yield spurious associations (Type I errors) or omit causal relationships (Type II errors). This is due to the autocorrelation present in the analysed time series -- a property that is ubiquitous across diverse applications, from brain dynamics to climate change. Here we show that, for limited data, this issue cannot be mediated by fitting a time-series model alone (e.g., in Granger causality or prewhitening approaches), and instead that the degrees of freedom in statistical tests should be altered to account for the effective sample size induced by cross-correlations in the observations. This insight enabled us to derive modified hypothesis tests for any multivariate correlation-based measures of linear dependence between covariance-stationary time series, including Granger causality and mutual information with Gaussian marginals. We use both numerical simulations (generated by autoregressive models and digital filtering) as well as recorded fMRI-neuroimaging data to show that our tests are unbiased for a variety of stationary time series. Our experiments demonstrate that the commonly used $F$- and $\chi^2$-tests can induce significant false-positive rates of up to $100\%$ for both measures, with and without prewhitening of the signals. These findings suggest that many dependencies reported in the scientific literature may have been, and may continue to be, spuriously reported or missed if modified hypothesis tests are not used when analysing time series.
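The idea of adjusting the degrees of freedom for an effective sample size can be illustrated in the simplest bivariate case. The sketch below is not the modified tests derived in this work; it assumes Bartlett's classical approximation for the correlation between two autocorrelated series, in which the sample size is divided by an inflation factor built from the product of the two autocorrelation functions. The function names, the AR(1) coefficients, and the lag cutoff `max_lag` are illustrative assumptions.

```python
import numpy as np


def autocorr(x, max_lag):
    """Sample autocorrelation of x at lags 1..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[:-k], x[k:]) / denom for k in range(1, max_lag + 1)])


def effective_sample_size(x, y, max_lag=50):
    """Bartlett-style effective sample size for the correlation of x and y.

    Assumption: N_eff = N / (1 + 2 * sum_k (1 - k/N) * rho_x(k) * rho_y(k)),
    with the sum truncated at max_lag (a hypothetical cutoff).
    """
    n = len(x)
    lags = np.arange(1, max_lag + 1)
    rho_x = autocorr(x, max_lag)
    rho_y = autocorr(y, max_lag)
    inflation = 1.0 + 2.0 * np.sum((1.0 - lags / n) * rho_x * rho_y)
    # Guard against a non-positive estimate caused by sampling noise.
    return n / max(inflation, 1e-12)


def simulate_ar1(phi, n, rng, burn_in=200):
    """Simulate a stationary AR(1) process x[t] = phi * x[t-1] + noise."""
    noise = rng.standard_normal(n + burn_in)
    x = np.zeros(n + burn_in)
    for t in range(1, n + burn_in):
        x[t] = phi * x[t - 1] + noise[t]
    return x[burn_in:]


# Two independent but strongly autocorrelated series (illustrative values).
rng = np.random.default_rng(0)
n = 2000
x = simulate_ar1(0.8, n, rng)
y = simulate_ar1(0.8, n, rng)

r = np.corrcoef(x, y)[0, 1]
n_eff = effective_sample_size(x, y)

# t-statistics for the null of zero correlation, with naive and adjusted
# degrees of freedom; the adjusted statistic is smaller in magnitude.
t_naive = r * np.sqrt((n - 2) / (1.0 - r**2))
t_adjusted = r * np.sqrt((n_eff - 2) / (1.0 - r**2))

print(f"N = {n}, estimated N_eff = {n_eff:.0f}")
print(f"naive t = {t_naive:.2f}, adjusted t = {t_adjusted:.2f}")
```

For two AR(1) processes with coefficients a and b, the inflation factor under this approximation is roughly (1 + ab) / (1 - ab), so strongly autocorrelated series can carry far fewer effective observations than their nominal length suggests. A test that uses the nominal length therefore overstates significance.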
