Computational analysis of time-course data with an underlying causal structure is needed in a variety of domains, including neural spike trains, stock price movements, and gene expression levels. However, it is challenging to determine from the numerical time-course data alone what is coordinating the visible processes, to separate the prima facie causes into genuine and spurious ones, and to do so with feasible computational complexity. For this purpose, we have been developing a novel algorithm based on a framework that combines philosophical notions of causality with algorithmic approaches built on model checking and statistical techniques for multiple hypothesis testing. Causal relationships are described as temporal logic formulae, which reframes the inference problem as one of model checking. The logic used, PCTL (probabilistic computation tree logic), allows description of both the time between cause and effect and the probability of this relationship being observed. We show that, equipped with these causal formulae and their associated probabilities, we can compute the average impact a cause has on its effect and then discover statistically significant causes through multiple hypothesis testing (treating each causal relationship as a hypothesis) and false discovery control. By exploring a well-chosen family of hypotheses that potentially includes all the significant ones and has reasonably minimal description length, it is possible to tame the algorithm's computational complexity while still exploring nearly the complete search space of prima facie causes. We have tested these ideas in a number of domains and illustrate them here with two examples.
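To make the two statistical steps named above more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes binary time series, a single fixed lag in place of a PCTL time window, and an average impact defined as the mean, over the other observed variables x, of P(e | c and x) - P(e | not c and x). The abstract does not name a false discovery procedure, so the Benjamini-Hochberg step-up rule is used here purely as a stand-in. The function names `cond_prob`, `average_impact`, and `benjamini_hochberg` are hypothetical, and the step that turns impact scores into p-values is omitted.

```python
from typing import Dict, List, Optional, Sequence


def cond_prob(effect: Sequence[int], given: Sequence[int], lag: int) -> Optional[float]:
    """Estimate P(effect at t+lag | given at t) by counting; None if `given` never holds."""
    hits = total = 0
    for t in range(len(effect) - lag):
        if given[t]:
            total += 1
            hits += effect[t + lag]
    return hits / total if total else None


def average_impact(series: Dict[str, List[int]], cause: str, effect: str, lag: int = 1) -> float:
    """Assumed definition: mean over other variables x of
    P(e | c and x) - P(e | not c and x), with e measured `lag` steps later."""
    c, e = series[cause], series[effect]
    diffs = []
    for name, x in series.items():
        if name in (cause, effect):
            continue
        with_c = cond_prob(e, [ci and xi for ci, xi in zip(c, x)], lag)
        without_c = cond_prob(e, [(not ci) and xi for ci, xi in zip(c, x)], lag)
        # Skip x when either conditioning event was never observed.
        if with_c is not None and without_c is not None:
            diffs.append(with_c - without_c)
    return sum(diffs) / len(diffs) if diffs else 0.0


def benjamini_hochberg(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Stand-in FDR control: reject the k smallest p-values, where k is the
    largest rank with p_(k) <= alpha * k / m."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            cutoff = rank
    rejected = [False] * m
    for i in order[:cutoff]:
        rejected[i] = True
    return rejected


# Toy data: e fires exactly one step after c; x is always on.
toy = {
    "c": [1, 0, 1, 0, 1, 0, 1, 0],
    "e": [0, 1, 0, 1, 0, 1, 0, 1],
    "x": [1, 1, 1, 1, 1, 1, 1, 1],
}
impact = average_impact(toy, "c", "e", lag=1)
flags = benjamini_hochberg([0.001, 0.04, 0.03, 0.5], alpha=0.05)
```

In this toy setting each candidate cause-effect pair would be treated as one hypothesis. A real analysis would need a principled way to derive p-values from the impact scores, which this sketch leaves out.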