While considerable effort has gone into identifying signaling pathways from data using linear Gaussian Bayesian networks, less emphasis has been placed on understanding and quantifying the conditional densities and probabilities of a node given its parents in the identified Bayesian network. Most graphical models for continuous data assume a multivariate Gaussian distribution, which might be too restrictive. We re-analyse data from an experimental setting considered in Sachs et al. (2005) to illustrate the effects of such restrictions. For this we propose a novel non-Gaussian nonlinear structural equation model based on vine copulas. In particular, the D-vine regression approach of Kraus and Czado (2017) is adapted. We show that this model class is better suited to fit the data than the standard linear structural equation model based on the biological consensus graph given in Sachs et al. (2005). The modelling approach also makes it possible to study which pathway edges are supported by the data and which can be removed. For the experiment cd3cd28+aktinhib, this approach identified three edges that are no longer supported by the data. For each of these edges, a plausible explanation based on the underlying experimental conditions could be found.
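To make the contrast between the two model classes concrete, they can be sketched for a single node $X_j$ with parents $\mathrm{pa}(j)$ in the graph. The notation ($\beta_{jk}$, $\sigma_j$, $F$, $c$) is introduced here for illustration and is not taken from the paper. The second display is the generic copula decomposition that follows from Sklar's theorem, not the specific D-vine construction of Kraus and Czado (2017).

```latex
% Linear Gaussian structural equation for node j (illustrative notation)
X_j = \beta_{j0} + \sum_{k \in \mathrm{pa}(j)} \beta_{jk} X_k + \varepsilon_j,
\qquad \varepsilon_j \sim N(0, \sigma_j^2),
% which implies a Gaussian conditional distribution
X_j \mid X_{\mathrm{pa}(j)} = x_{\mathrm{pa}(j)}
  \;\sim\; N\Bigl(\beta_{j0} + \sum_{k \in \mathrm{pa}(j)} \beta_{jk} x_k,\; \sigma_j^2\Bigr).

% Copula-based conditional density (Sklar's theorem), with marginal
% distribution functions F and copula densities c
f_{j \mid \mathrm{pa}(j)}\bigl(x_j \mid x_{\mathrm{pa}(j)}\bigr)
  = \frac{c_{j,\mathrm{pa}(j)}\bigl(F_j(x_j),\, (F_k(x_k))_{k \in \mathrm{pa}(j)}\bigr)}
         {c_{\mathrm{pa}(j)}\bigl((F_k(x_k))_{k \in \mathrm{pa}(j)}\bigr)}
    \, f_j(x_j).

% Special case of a single parent k (the denominator equals 1)
f_{j \mid k}(x_j \mid x_k) = c_{jk}\bigl(F_j(x_j), F_k(x_k)\bigr)\, f_j(x_j).
```

In the Gaussian case the conditional mean is linear in the parents and the conditional variance is constant. A copula-based model relaxes both restrictions because the margins and the dependence structure are specified separately.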