Causal discovery from observational data is a challenging and often impossible task. However, the causal structure can be estimated under certain assumptions on the data-generating process. Many commonly used methods rely on the additivity of the noise in the structural equation models. Additivity implies that the variance or the tail of the effect, given the causes, is invariant; the cause affects only the mean. However, the tail or other characteristics of the random variable can provide different information about the causal structure, and such cases have received very little attention in the literature. The causal graph has been shown to be identifiable under several models, such as linear non-Gaussian, post-nonlinear, or quadratic variance functional models. We introduce a new class of models, called Conditional Parametric Causal Models (CPCM), in which the cause affects the effect through some characteristics of interest. We use sufficient statistics to show the identifiability of CPCM in the exponential family of conditional distributions. We also propose an algorithm for estimating the causal structure from a random sample under CPCM. Its empirical properties are studied on various data sets, including an application to the expenditure behavior of residents of the Philippines.
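The contrast between additive noise and a cause that acts on another characteristic of the effect can be illustrated with a small simulation. This is only a sketch under assumed settings: the Gaussian family, the functional forms `2 * x` and `sd = x`, and the helper names `simulate` and `residual_sd_by_cause` are illustrative choices, not taken from the paper, and the code is not the estimation algorithm the abstract refers to.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Draw a cause X and two effects (illustrative, assumed forms).

    y_additive: additive-noise model, Y = 2*X + N(0, 1); X shifts only the mean.
    y_scale:    Gaussian effect with mean 0 and standard deviation X;
                X changes the spread of Y, not its mean.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(1.0, 3.0, size=n)
    y_additive = 2.0 * x + rng.normal(0.0, 1.0, size=n)
    y_scale = rng.normal(0.0, x)  # scale broadcasts element-wise over x
    return x, y_additive, y_scale


def residual_sd_by_cause(x, residual, threshold=2.0):
    """Standard deviation of the residual for small and large values of X."""
    low = residual[x < threshold]
    high = residual[x >= threshold]
    return low.std(), high.std()


if __name__ == "__main__":
    x, y_additive, y_scale = simulate()
    # Subtract the (known, assumed) conditional mean before comparing spreads.
    print("additive:", residual_sd_by_cause(x, y_additive - 2.0 * x))
    print("scale   :", residual_sd_by_cause(x, y_scale))
```

Under the additive model the residual spread is roughly the same for small and large values of the cause, whereas in the second model it grows with the cause. That dependence is the kind of information a purely additive model cannot represent.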
Translated title: Identifiability of Causal Graphs under Non-Additive Conditional Parametric Causal Models