Causal discovery, which goes beyond inferring a network as a collection of connected dots, offers a crucial functionality for scientific discovery with artificial intelligence. Questions that arise in many domains, such as physics, physiology, climatology, and strategic decision-making in uncertain multi-agent environments, among many others, are rooted in causality and reasoning. It has become apparent that many real-world temporal observations are nonlinearly related to each other. While the number of observations can be as high as millions of points, the number of temporal samples can be minimal for ethical or practical reasons, leading to the curse of dimensionality in large-scale systems. This paper proposes a novel method that uses kernel principal component analysis and pre-images to obtain nonlinear dependencies in multivariate time-series data. We show that our method outperforms state-of-the-art causal discovery methods when the observations are limited in time and nonlinearly related. Extensive simulations on both real-world and synthetic datasets with various topologies are provided to evaluate the proposed method.
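As a rough illustration of the two building blocks named in the abstract, the sketch below runs kernel principal component analysis on a small multivariate time series and then maps the component scores back to input space with a learned pre-image map. It is not the paper's causal discovery procedure, which the abstract does not specify. The function names (`rbf_kernel`, `kernel_pca`, `fit_preimage_map`), the toy series, the Gaussian kernel, and the values of `gamma` and `ridge` are all assumptions made for the example, and the pre-image step uses kernel ridge regression as one common approximation.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def kernel_pca(X, gamma, n_components):
    """Return the scores of the rows of X on the leading kernel principal components."""
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    Kc = H @ K @ H                             # kernel matrix centered in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = np.maximum(eigvals[order], 0.0)      # guard against round-off below zero
    V = eigvecs[:, order]
    # The score of training point i on component k is sqrt(lambda_k) * V[i, k].
    return V * np.sqrt(lam)


def fit_preimage_map(Z, X, gamma_z, ridge):
    """Learn a map from component scores Z back to inputs X by kernel ridge regression."""
    Kz = rbf_kernel(Z, Z, gamma_z)
    W = np.linalg.solve(Kz + ridge * np.eye(Z.shape[0]), X)
    return lambda Z_new: rbf_kernel(Z_new, Z, gamma_z) @ W


# Hypothetical toy series: 60 time samples of 3 nonlinearly related variables.
t = np.linspace(0.0, 2.0 * np.pi, 60, endpoint=False)
x1 = np.sin(t)
x2 = x1**2          # nonlinear function of x1
x3 = np.cos(t)
X = np.column_stack([x1, x2, x3])

Z = kernel_pca(X, gamma=0.5, n_components=2)               # nonlinear components
preimage = fit_preimage_map(Z, X, gamma_z=0.5, ridge=1e-3)
X_rec = preimage(Z)                                        # approximate pre-images

print("mean squared reconstruction error:", np.mean((X_rec - X) ** 2))
```

The pre-image step matters because kernel PCA operates in an implicit feature space: the components have no direct coordinates in the space of the original variables, so any dependency found there has to be mapped back before it can be read in terms of the observed time series.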