Causal Ordering for Structure Learning From Time Series

Predicting causal structure from time series data is crucial for understanding complex phenomena in physiology, brain connectivity, climate dynamics, and socio-economic behaviour. Causal discovery in time series is hindered by the combinatorial complexity of identifying true causal relationships, especially as the number of variables and time points grows. A common way to simplify the task is to use ordering-based methods. However, traditional ordering methods rely on a single causal ordering, which inherently limits the representational capacity of the resulting model. In this work, we address this issue by leveraging multiple valid causal orderings rather than the single ordering used in standard practice. We propose DOTS (Diffusion Ordered Temporal Structure), which applies diffusion-based causal discovery to temporal data. By integrating multiple orderings, DOTS effectively recovers the transitive closure of the underlying directed acyclic graph, mitigating the spurious artifacts inherent in single-ordering approaches. We formalise the problem under standard assumptions such as stationarity and the additive noise model, and leverage score matching with diffusion processes to enable efficient Hessian estimation. Empirical evaluations on synthetic and real-world datasets demonstrate that DOTS outperforms state-of-the-art baselines. On synthetic benchmarks ($d{=}3\!-\!6$ variables, $T{=}200\!-\!5{,}000$ samples), DOTS improves mean window-graph $F1$ from $0.63$ (best baseline) to $0.81$. On the CausalTime real-world benchmark ($d{=}20\!-\!36$), baselines remain the best on individual datasets, but DOTS attains the highest average summary-graph $F1$ while halving runtime relative to graph-optimisation methods. These results establish DOTS as a scalable and accurate solution for temporal causal discovery.
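The claim that combining multiple valid orderings recovers the transitive closure can be checked on a toy example. The sketch below is only a brute-force illustration on a static four-node DAG, not the DOTS procedure itself (which obtains orderings through diffusion-based score matching on temporal data). The graph and the helper names `valid_orderings`, `ordering_to_pairs`, and `transitive_closure` are hypothetical and chosen for this example.

```python
from itertools import permutations


def valid_orderings(d, edges):
    """Return every permutation of range(d) that places i before j for each edge (i, j)."""
    result = []
    for perm in permutations(range(d)):
        pos = {node: k for k, node in enumerate(perm)}
        if all(pos[i] < pos[j] for i, j in edges):
            result.append(perm)
    return result


def ordering_to_pairs(perm):
    """Fully connected DAG implied by one ordering: each node points to all later nodes."""
    n = len(perm)
    return {(perm[a], perm[b]) for a in range(n) for b in range(a + 1, n)}


def transitive_closure(d, edges):
    """Reachability relation of the DAG, computed with Warshall's algorithm."""
    reach = set(edges)
    for k in range(d):
        for i in range(d):
            for j in range(d):
                if (i, k) in reach and (k, j) in reach:
                    reach.add((i, j))
    return reach


# Hypothetical toy DAG (an assumption for illustration): 0 -> 1 -> 3 and 0 -> 2.
d = 4
edges = {(0, 1), (1, 3), (0, 2)}

orderings = valid_orderings(d, edges)
single = ordering_to_pairs(orderings[0])
combined = set.intersection(*(ordering_to_pairs(p) for p in orderings))
closure = transitive_closure(d, edges)

print("valid orderings:", orderings)
print("spurious pairs from one ordering:", sorted(single - closure))
print("intersection equals transitive closure:", combined == closure)
```

In this example a single ordering permits pairs such as (1, 2) that correspond to no directed path in the graph, whereas the pairs shared by all valid orderings coincide exactly with the transitive closure. Enumerating all permutations is feasible only for very small $d$ and serves purely to make the argument concrete.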
