Causal phenomena associated with rare events arise in a wide range of engineering problems, such as risk-sensitive safety analysis, accident analysis and prevention, and extreme value theory. However, current causal discovery methods often fail to uncover causal links between random variables in a dynamic setting when those links manifest only once the variables first experience low-probability realizations. To address this issue, we introduce a novel statistical independence test for data collected from time-invariant dynamical systems in which rare but consequential events occur. In particular, we exploit the time-invariance of the underlying data to construct a superimposed dataset of the system state before rare events that happen at different timesteps. We then design a conditional independence test on the reorganized data. We provide non-asymptotic sample complexity bounds for the consistency of our method, and validate its performance on simulated and real-world datasets, including incident data collected from the Caltrans Performance Measurement System (PeMS). Code containing the datasets and experiments is publicly available.
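To make the two steps described in the abstract concrete (superimposing data around rare events, then testing conditional independence on the reorganized data), the following is a minimal illustrative sketch, not the method or code released with this work. It assumes each trajectory is a list of state tuples, aligns trajectories at the first timestep where a user-supplied predicate flags a rare event, and swaps in a simple stratified permutation test with a discrete conditioning variable. The names `superimpose_at_first_event`, `stratified_permutation_pvalue`, and the helper `_stat` are hypothetical.

```python
import random
from collections import defaultdict


def superimpose_at_first_event(trajectories, is_rare):
    """Align trajectories at the first timestep whose state satisfies is_rare.

    Returns one (state_before, state_at_event) pair per trajectory in which
    the first event occurs at index >= 1. Trajectories with no event, or with
    the event at index 0 (no preceding state), are skipped.
    """
    pairs = []
    for traj in trajectories:
        for t, state in enumerate(traj):
            if is_rare(state):
                if t >= 1:
                    pairs.append((traj[t - 1], state))
                break  # only the first occurrence is used
    return pairs


def _stat(x, y, strata):
    """Sum over strata of the absolute within-stratum cross-product of x and y."""
    total = 0.0
    for idx in strata.values():
        mx = sum(x[i] for i in idx) / len(idx)
        my = sum(y[i] for i in idx) / len(idx)
        total += abs(sum((x[i] - mx) * (y[i] - my) for i in idx))
    return total


def stratified_permutation_pvalue(x, y, z, n_perm=2000, seed=0):
    """Permutation p-value for 'x independent of y given discrete z'.

    Assumption of this sketch: y is shuffled only within groups that share
    the same z value, which preserves any dependence of y on z under the null.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for i, zi in enumerate(z):
        strata[zi].append(i)

    observed = _stat(x, y, strata)
    hits = 0
    y_perm = list(y)
    for _ in range(n_perm):
        for idx in strata.values():
            vals = [y[i] for i in idx]
            rng.shuffle(vals)
            for i, v in zip(idx, vals):
                y_perm[i] = v
        if _stat(x, y_perm, strata) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (1 + hits) / (1 + n_perm)
```

In use, one would take the pairs returned by `superimpose_at_first_event`, read a candidate cause from the pre-event state and a candidate effect from the event-time state, and pass them to `stratified_permutation_pvalue` together with a discrete covariate. A small p-value is evidence against conditional independence in the superimposed data; the non-asymptotic guarantees stated in the abstract do not apply to this simplified stand-in.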