In this article, we describe the algorithms for causal structure learning from time series data that won the Causality 4 Climate competition at the Conference on Neural Information Processing Systems 2019 (NeurIPS). We examine how our combination of established ideas achieves competitive performance on semi-realistic and realistic time series data exhibiting challenges common in real-world Earth sciences data. In particular, we discuss (a) a rationale for leveraging linear methods to identify causal links in non-linear systems, and (b) a simulation-backed explanation of why large regression coefficients may predict causal links better in practice than small p-values, and thus why normalising the data may sometimes hinder causal structure learning. For benchmark usage, we detail the algorithms here and provide implementations at https://github.com/sweichwald/tidybench. We propose the presented competition-proven methods as baselines for benchmark comparisons, to guide the development of novel algorithms for structure learning from time series.
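To make point (b) concrete, the following is a minimal sketch of scoring candidate links by the magnitude of lagged regression coefficients. It is not one of the competition algorithms or the tidybench API: the helper names `simulate_var1` and `lagged_coefficient_scores`, the three-variable linear system, and all parameter values are assumptions chosen purely for illustration.

```python
import numpy as np


def simulate_var1(A, n_steps, rng, noise_scale=1.0):
    """Simulate X_t = A @ X_{t-1} + noise, where A[i, j] is the effect of j on i.

    Hypothetical toy data generator, used only for this illustration.
    """
    d = A.shape[0]
    X = np.zeros((n_steps, d))
    for t in range(1, n_steps):
        X[t] = A @ X[t - 1] + noise_scale * rng.standard_normal(d)
    return X


def lagged_coefficient_scores(X, lag=1):
    """Regress X_t on X_{t-lag} by ordinary least squares.

    Returns a (d, d) array in which scores[i, j] is the absolute
    coefficient of variable j (lagged) when predicting variable i.
    """
    past, present = X[:-lag], X[lag:]
    # Solve past @ B ~ present; B[j, i] is the coefficient of j for target i.
    B, *_ = np.linalg.lstsq(past, present, rcond=None)
    return np.abs(B.T)


rng = np.random.default_rng(0)
# Assumed ground truth: links 0 -> 1 and 1 -> 2, plus autoregressive terms.
A_true = np.array([
    [0.5, 0.0, 0.0],
    [0.4, 0.5, 0.0],
    [0.0, -0.4, 0.5],
])
X = simulate_var1(A_true, n_steps=2000, rng=rng)
scores = lagged_coefficient_scores(X)
print(np.round(scores, 2))
```

In this sketch the coefficients are estimated on the raw, unnormalised series, so their magnitudes still reflect the scale of each variable. Whether that helps or hurts on a given dataset is the question the article examines.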