We address the problem of integrating data from multiple, possibly biased, observational and interventional studies in order to compute counterfactuals in structural causal models. We start from the case of a single observational dataset affected by selection bias. We show that the likelihood of the available data has no local maxima. This enables us to use the causal expectation-maximisation scheme to compute approximate bounds for partially identifiable counterfactual queries, which are the focus of this paper. We then show how the same approach can solve the general case of multiple datasets, whether interventional or observational, biased or unbiased, by remapping it into the single-dataset case via graphical transformations. Systematic numerical experiments and a case study on palliative care show the effectiveness and accuracy of our approach, while hinting at the benefits of integrating heterogeneous data to obtain informative bounds under partial identifiability.