A common assumption in causal inference from observational data is that there is no hidden confounding. Yet it is, in general, impossible to verify from a single dataset whether hidden confounding factors are present. Under the assumption of independent causal mechanisms underlying the data-generating process, we demonstrate a way to detect unobserved confounders given multiple observational datasets from different environments. We present a theory for testable conditional independencies that are violated only in the presence of hidden confounding, and we examine cases where its assumptions are broken: degenerate and dependent mechanisms, and faithfulness violations. Additionally, we propose a procedure to test these independencies and study its empirical finite-sample behavior in simulation studies.