We consider modeling a binary response variable together with a set of covariates for two groups using observational data. The grouping variable can be a confounder (a common cause of treatment and outcome), gender, case/control status, ethnicity, etc. Given the covariates and a binary latent variable, the goal is to construct two directed acyclic graphs (DAGs), one per group, that share some common parameters. The set of nodes, which represent the variables, is the same for both groups, but the directed edges between nodes, which represent causal relationships between the variables, may differ. For each group, we also estimate the effect size for each node. We assume that the data in each group follow a Gaussian distribution under that group's DAG. By the Markov property of DAGs, the joint distribution factorizes into the conditional distributions of each node given its parent nodes. We introduce the Gaussian DAG-probit model for two groups, which we call the doubly Gaussian DAG-probit model. To estimate the skeleton of the DAGs and the model parameters, we draw samples from the posterior distribution of the doubly Gaussian DAG-probit model via MCMC. We validated the proposed method in a comprehensive simulation experiment and applied it to two real datasets. Furthermore, we validated the results of the real data analysis against well-known experimental studies to show the value of the proposed grouping variable in the causality domain.
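As a rough illustration of the data-generating process described above, the following sketch simulates two groups that share the same nodes but have different directed edges, with a binary response obtained by thresholding a Gaussian latent node (the usual probit construction). Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `simulate_group`, the two adjacency matrices, the shared edge weight of 0.8, the unit noise variances, and the threshold at zero. It does not implement the MCMC sampler.

```python
import numpy as np


def simulate_group(adj, weights, n, rng):
    """Draw n samples from a linear Gaussian DAG.

    adj[j, k] == 1 means there is an edge j -> k. Nodes are assumed to be
    listed in topological order (adj is strictly upper triangular), and the
    last node is the latent variable behind the binary response.
    """
    q = adj.shape[0]
    X = np.zeros((n, q))
    for k in range(q):  # visit nodes in topological order
        parents = np.flatnonzero(adj[:, k])
        # Each node depends only on its parents (Markov factorization).
        mean = X[:, parents] @ weights[parents, k]
        X[:, k] = mean + rng.standard_normal(n)  # unit noise variance (assumption)
    latent = X[:, -1]
    y = (latent > 0).astype(int)  # probit-style thresholding of the latent node
    return X[:, :-1], y


rng = np.random.default_rng(0)

# Nodes 0, 1, 2 are covariates; node 3 is the latent response node.
# Hypothetical graphs: same nodes in both groups, different edges.
adj_a = np.array([[0, 1, 0, 1],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1],
                  [0, 0, 0, 0]])
adj_b = np.array([[0, 1, 0, 0],
                  [0, 0, 0, 1],
                  [0, 0, 0, 1],
                  [0, 0, 0, 0]])

# One weight matrix used by both groups, standing in for "shared parameters".
weights = np.full((4, 4), 0.8)

X_a, y_a = simulate_group(adj_a, weights, 500, rng)
X_b, y_b = simulate_group(adj_b, weights, 500, rng)
```

Simulated data of this kind, where the true graphs are known, is the sort of input a simulation study could use to check how well a structure-learning method recovers each group's edges.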