Causal discovery from time-series data is a central task in machine learning. Recently, Granger causality inference has been gaining momentum because of its good explainability and high compatibility with emerging deep neural networks. However, most existing methods assume structured input data and degrade sharply when they encounter data with randomly missing entries or non-uniform sampling frequencies, which hampers their application in real scenarios. To address this issue, we present CUTS, a neural Granger causal discovery algorithm that jointly imputes unobserved data points and builds causal graphs by plugging two mutually boosting modules into an iterative framework: (i) a latent data prediction stage, which designs a Delayed Supervision Graph Neural Network (DSGNN) to hallucinate and register unstructured data that may be high-dimensional and have a complex distribution; (ii) a causal graph fitting stage, which builds a causal adjacency matrix from the imputed data under a sparsity penalty. Experiments show that CUTS effectively infers causal graphs from unstructured time-series data and significantly outperforms existing methods. Our approach constitutes a promising step towards applying causal discovery to real applications with non-ideal observations.
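The alternation between imputation and graph fitting described above can be illustrated with a deliberately simplified sketch. This is not the CUTS implementation: the DSGNN is replaced by a linear lag-1 predictor, and the causal graph fitting stage by an L1-penalized least-squares fit solved with proximal gradient descent (ISTA). All function names, the lag-1 model, and the hyperparameters (`lam`, `n_rounds`, `n_iter`) are assumptions made for illustration only.

```python
import numpy as np


def soft_threshold(x, lam):
    """Proximal operator of the L1 norm (element-wise shrinkage)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def fit_sparse_var1(X, lam=0.05, n_iter=500):
    """Fit x[t+1] ~ A @ x[t] with an L1 penalty on A, using ISTA.

    A[i, j] is the estimated lag-1 influence of series j on series i.
    Linear stand-in for the graph fitting stage (an assumption).
    """
    past, future = X[:-1], X[1:]
    n, d = past.shape
    # Step size 1/L, where L is the Lipschitz constant of the smooth part.
    step = n / np.linalg.norm(past, 2) ** 2
    A = np.zeros((d, d))
    for _ in range(n_iter):
        grad = (past @ A.T - future).T @ past / n
        A = soft_threshold(A - step * grad, step * lam)
    return A


def impute_and_discover(X_obs, mask, n_rounds=10, lam=0.05):
    """Alternate between graph fitting and imputing missing entries.

    X_obs: (T, d) array; entries where mask is False are ignored.
    mask:  (T, d) boolean array, True where a value was observed.
    """
    # Initialise missing entries with each series' observed mean.
    col_means = np.where(mask, X_obs, 0.0).sum(0) / mask.sum(0)
    X = np.where(mask, X_obs, col_means)
    missing = ~mask[1:]
    for _ in range(n_rounds):
        A = fit_sparse_var1(X, lam)      # graph fitting stage
        pred = X[:-1] @ A.T              # one-step-ahead predictions
        X[1:][missing] = pred[missing]   # prediction (imputation) stage
    return A, X


def simulate_var1(A_true, T=500, seed=0):
    """Toy data generator: linear VAR(1) with unit Gaussian noise."""
    rng = np.random.default_rng(seed)
    X = np.zeros((T, A_true.shape[0]))
    for t in range(1, T):
        X[t] = A_true @ X[t - 1] + rng.standard_normal(A_true.shape[0])
    return X


if __name__ == "__main__":
    A_true = np.array([[0.8, 0.0, 0.0],
                       [0.5, 0.5, 0.0],
                       [0.0, 0.0, 0.7]])
    X_full = simulate_var1(A_true)
    mask = np.random.default_rng(1).random(X_full.shape) > 0.2  # ~20% missing
    A_hat, _ = impute_and_discover(np.where(mask, X_full, np.nan), mask)
    print(np.round(A_hat, 2))
```

Observed entries are never overwritten; only the missing ones are refreshed in each round, so the imputed data and the estimated adjacency matrix improve each other across iterations. A nonlinear predictor and longer lags would be needed to approach the setting the abstract targets.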