Causal phenomena associated with rare events occur across a wide range of engineering problems, such as risk-sensitive safety analysis, accident analysis and prevention, and extreme value theory. However, current causal discovery methods often fail to uncover causal links between random variables in a dynamic setting when those links manifest only once the variables first experience low-probability realizations. To address this issue, we introduce a novel statistical independence test on data collected from time-invariant dynamical systems in which rare but consequential events occur. In particular, we exploit the time-invariance of the underlying data to construct a superimposed dataset of the system states observed before rare events that occur at different timesteps. We then design a conditional independence test on the reorganized data. We provide non-asymptotic sample complexity bounds for the consistency of our method, and validate its performance on simulated and real-world datasets, including incident data collected from the Caltrans Performance Measurement System (PeMS). Code containing the datasets and experiments is publicly available.
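The superimposition step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `superimpose_pre_event_states`, the `is_rare` predicate, the `lag` parameter, and the array layout are all assumptions introduced here for illustration. The sketch only shows how time-invariance might justify pooling the states that precede the first rare event in each trajectory, even though those events occur at different timesteps.

```python
import numpy as np


def superimpose_pre_event_states(trajectories, is_rare, lag=1):
    """Pool the state observed `lag` steps before the FIRST rare event.

    trajectories : array of shape (n_traj, T, d), one row block per trajectory
    is_rare      : hypothetical predicate mapping a state vector (d,) to bool
    lag          : how many steps before the event to look (assumed parameter)

    Returns (rows, times): the pooled pre-event states, shape (n_hits, d),
    and the timestep at which each first rare event occurred.
    """
    rows, times = [], []
    for traj in trajectories:
        # Flag every timestep whose state counts as a rare event.
        flags = np.array([is_rare(x) for x in traj])
        hits = np.flatnonzero(flags)
        # Skip trajectories with no rare event, or with an event so early
        # that no state exists `lag` steps before it.
        if hits.size == 0 or hits[0] < lag:
            continue
        t = int(hits[0])
        rows.append(traj[t - lag])
        times.append(t)
    return np.array(rows), np.array(times)


# Toy data: three scalar trajectories of length 5 (illustrative only).
demo = np.array([
    [[0.0], [1.0], [2.0], [10.0], [3.0]],  # first rare event at t = 3
    [[0.0], [0.0], [0.0], [0.0], [0.0]],   # no rare event
    [[0.0], [7.0], [0.0], [8.0], [0.0]],   # first rare event at t = 1
])
pooled, event_times = superimpose_pre_event_states(demo, lambda x: x[0] > 5.0)
```

Under the time-invariance assumption, the rows of `pooled` can be treated as draws from a common pre-event distribution, so any conditional independence test could then be applied to them. Which test statistic to use is left open here.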