Causal phenomena associated with rare events occur across a wide range of engineering problems, such as risk-sensitive safety analysis, accident analysis and prevention, and extreme value theory. However, current causal discovery methods often fail to uncover causal links between random variables in a dynamic setting when those links manifest only as the variables first experience low-probability realizations. To address this issue, we introduce a novel statistical independence test for data collected from time-invariant dynamical systems in which rare but consequential events occur. In particular, we exploit the time-invariance of the underlying data to construct a superimposed dataset of the system states observed before rare events that occur at different timesteps. We then design a conditional independence test on the reorganized data. We provide non-asymptotic sample complexity bounds for the consistency of our method, and validate its performance across various simulated and real-world datasets, including incident data collected from the Caltrans Performance Measurement System (PeMS). Code containing the datasets and experiments is publicly available.
Title: Towards Dynamic Causal Discovery with Rare Events: A Nonparametric Conditional Independence Test
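The data reorganization described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' released code. All names here (`superimpose_pre_event_states`, `permutation_pvalue`, the `lag` parameter) are hypothetical, and the rule of keeping only the first rare event per trajectory is an assumption made for illustration. The permutation test on absolute Pearson correlation is a simple unconditional stand-in for the paper's nonparametric conditional independence test, included only to show how the reorganized data might be consumed.

```python
import numpy as np


def superimpose_pre_event_states(trajectories, indicators, lag=1):
    """Align trajectories on their first rare event and stack the results.

    trajectories: list of arrays of shape (T_i, d), one per recorded run
    indicators:   list of boolean arrays of shape (T_i,), True where a
                  rare event occurs (assumed to be given)
    Returns (before, at_event): the states `lag` steps before the first
    rare event and the states at the event, one row per usable trajectory.
    """
    before, at_event = [], []
    for states, indicator in zip(trajectories, indicators):
        hits = np.flatnonzero(indicator)
        if hits.size == 0 or hits[0] < lag:
            continue  # no rare event, or not enough history before it
        t = int(hits[0])
        before.append(states[t - lag])
        at_event.append(states[t])
    return np.array(before), np.array(at_event)


def permutation_pvalue(x, y, n_perm=500, seed=0):
    """Permutation p-value for dependence between 1-D samples x and y.

    Stand-in statistic: absolute Pearson correlation (unconditional).
    """
    rng = np.random.default_rng(seed)
    observed = abs(np.corrcoef(x, y)[0, 1])
    exceed = sum(
        abs(np.corrcoef(x, rng.permutation(y))[0, 1]) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction keeps the p-value strictly positive.
    return (1 + exceed) / (1 + n_perm)
```

Because the system is assumed time-invariant, rows drawn from events at different timesteps can be pooled into one sample. One column of `before` and one column of `at_event` can then be passed to `permutation_pvalue` to probe a single candidate link.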