An important problem in causal inference is to break down the total effect of a treatment on an outcome into different causal pathways and to quantify the causal effect along each pathway. For instance, in causal fairness, the total effect of being a male employee (i.e., treatment) comprises its direct effect on annual income (i.e., outcome) and its indirect effect via the employee's occupation (i.e., mediator). Causal mediation analysis (CMA) is a formal statistical framework commonly used to reveal such underlying causal mechanisms. One major challenge of CMA in observational studies is handling confounders: variables that induce spurious causal relationships among treatment, mediator, and outcome. Conventional methods assume sequential ignorability, which implies that all confounders can be measured and is often unverifiable in practice. This work aims to circumvent the stringent sequential ignorability assumption and to account for hidden confounders. Drawing upon proxy strategies and recent advances in deep learning, we propose to simultaneously uncover the latent variables that characterize hidden confounders and estimate the causal effects. Empirical evaluations on both synthetic and semi-synthetic datasets validate the effectiveness of the proposed method. We further show the potential of our approach for causal fairness analysis.
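The decomposition referred to above can be sketched in standard potential-outcome notation (a common CMA formulation; the symbols $T$, $M$, $Y$ for treatment, mediator, and outcome are illustrative, not notation from this work):

```latex
% Total effect of a binary treatment T on outcome Y, with mediator M:
\mathrm{TE}  = \mathbb{E}\!\left[ Y\big(1, M(1)\big) - Y\big(0, M(0)\big) \right]
% Natural direct effect: vary T while holding the mediator at its control value
\mathrm{NDE} = \mathbb{E}\!\left[ Y\big(1, M(0)\big) - Y\big(0, M(0)\big) \right]
% Natural indirect effect: hold T fixed while the mediator responds to treatment
\mathrm{NIE} = \mathbb{E}\!\left[ Y\big(1, M(1)\big) - Y\big(1, M(0)\big) \right]
% The two components recover the total effect:
\mathrm{TE} = \mathrm{NDE} + \mathrm{NIE}
```

Identifying these quantities from observational data is what requires sequential ignorability; when hidden confounders of the treatment–mediator or mediator–outcome relationships exist, the expectations above are no longer directly estimable, which motivates the proxy-based latent-variable approach.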