Causal discovery from observational data is a challenging and often impossible task. However, the causal structure can be estimated under certain assumptions on the data-generating process. Many commonly used methods rely on additivity of the noise in the structural equation models. Additivity implies that the variance or the tail of the effect, given the causes, is invariant; the cause affects only the mean. In many applications, it is desirable to model the tail or other characteristics of the random variable, since they can provide different information about the causal structure. However, models for causal inference in such cases have received little attention. It has been shown that the causal graph is identifiable under different models, such as linear non-Gaussian, post-nonlinear, or quadratic variance functional models. We introduce a new class of models called the Conditional Parametric Causal Models (CPCM), where the cause affects the effect in some of the characteristics of interest. We use the concept of sufficient statistics to show the identifiability of CPCM, focusing mostly on the exponential family of conditional distributions. We also propose an algorithm for estimating the causal structure from a random sample under CPCM. Its empirical properties are studied for various data sets, including an application to the expenditure behavior of residents of the Philippines.
Title: Identifiability of causal graphs under nonadditive conditional parametric causal models
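The contrast between additive noise and a cause that acts on another characteristic of the effect can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not the estimation algorithm proposed in the paper: the function names (`simulate`, `variance_ratio`), the quadratic mean function, and the log-linear scale function are all hypothetical choices made for this example. It generates one effect whose conditional mean depends on the cause and one whose conditional spread depends on the cause, then compares the residual variance across bins of the cause.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Draw a cause x and two hypothetical effects (illustrative assumptions)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 2.0, size=n)        # cause
    noise = rng.standard_normal(n)           # noise, independent of the cause
    y_additive = x**2 + noise                # cause shifts only the conditional mean
    y_scale = np.exp(0.5 * x) * noise        # cause changes only the conditional spread
    return x, y_additive, y_scale


def variance_ratio(x, y, n_bins=4, degree=3):
    """Largest over smallest residual variance across equal-count bins of x.

    The conditional mean is approximated by a polynomial fit, so a ratio
    close to 1 is what one would expect under additive noise.
    """
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    order = np.argsort(x)
    variances = [residuals[idx].var() for idx in np.array_split(order, n_bins)]
    return max(variances) / min(variances)


if __name__ == "__main__":
    x, y_additive, y_scale = simulate()
    print("additive noise, variance ratio:", variance_ratio(x, y_additive))
    print("cause-dependent scale, variance ratio:", variance_ratio(x, y_scale))
```

In the additive case the ratio stays close to 1, while in the second case it is clearly larger, which is the kind of structure a mean-only model cannot capture.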