Going beyond correlations, identifying and understanding causal relationships in observational time series, an important subfield of causal discovery, poses a major challenge. The lack of access to a well-defined ground truth for real-world data makes it necessary to rely on synthetic data to evaluate these methods. Existing benchmarks are limited in scope: they are either restricted to a "static" selection of data sets or do not allow a granular assessment of a method's performance when commonly made assumptions are violated. We propose a flexible and simple-to-use framework for generating time series data, aimed at developing, evaluating, and benchmarking time series causal discovery methods. In particular, the framework can be used to fine-tune novel methods on vast amounts of data without "overfitting" them to a benchmark, so that they instead perform well in real-world use cases. Using our framework, we evaluate prominent time series causal discovery methods and demonstrate both a notable degradation in performance when their assumptions are invalidated and their sensitivity to the choice of hyperparameters. Finally, we propose future research directions and discuss how our framework can support both researchers and practitioners.
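To make the idea of synthetic time series with a known causal ground truth concrete, the following is a minimal sketch and not the framework's actual API. It assumes a linear lag-1 vector autoregressive process with Gaussian noise; the function names `generate_linear_var` and `true_edges`, the coefficient matrix `A`, and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def generate_linear_var(adjacency, n_steps, noise_std=1.0, seed=0):
    """Simulate x_t = A @ x_{t-1} + noise (hypothetical helper, illustration only).

    adjacency[i][j] is the lag-1 linear effect of variable j on variable i.
    """
    rng = random.Random(seed)  # seeded so the data set is reproducible
    n_vars = len(adjacency)
    # The first observation is pure noise.
    series = [[rng.gauss(0.0, noise_std) for _ in range(n_vars)]]
    for _ in range(n_steps - 1):
        prev = series[-1]
        row = [
            sum(adjacency[i][j] * prev[j] for j in range(n_vars))
            + rng.gauss(0.0, noise_std)
            for i in range(n_vars)
        ]
        series.append(row)
    return series


def true_edges(adjacency):
    """Return the ground-truth lagged edges as (cause, effect) index pairs."""
    return {
        (j, i)
        for i, row in enumerate(adjacency)
        for j, weight in enumerate(row)
        if weight != 0.0
    }


# Assumed example: a chain X0 -> X1 -> X2, each variable also depending on its own past.
A = [
    [0.5, 0.0, 0.0],
    [0.4, 0.5, 0.0],
    [0.0, 0.4, 0.5],
]
data = generate_linear_var(A, n_steps=500)
ground_truth = true_edges(A)
```

Because the generating matrix is known, the edge set recovered by a causal discovery method can be scored directly against `ground_truth`. Under the same assumptions, replacing the Gaussian noise or the linear update with an alternative would be one way to probe how a method behaves when its assumptions are violated.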