Probabilities of causation play a fundamental role in decision-making in law, health care, and public policy. Nevertheless, their point identification is challenging and requires strong assumptions such as monotonicity. In the absence of such assumptions, existing work requires multiple datasets that contain the same treatment and outcome variables in order to establish bounds on these probabilities. However, in many clinical trials and public policy evaluations, there exist independent datasets that each examine the effect of a different treatment on the same outcome variable. Here, we outline how to significantly tighten existing bounds on the probabilities of causation by imposing counterfactual consistency between structural causal models (SCMs) constructed from such independent datasets (the 'causal marginal problem'). Next, we describe a new information-theoretic approach to the falsification of counterfactual probabilities, using conditional mutual information to quantify counterfactual influence. This approach generalises to arbitrary discrete variables and any number of treatments, and renders the causal marginal problem more interpretable. Since the question of what counts as 'tight enough' is left to the user, we provide an additional method of inference for when the bounds are unsatisfactory: a maximum-entropy-based method that defines a metric on the space of plausible SCMs and proposes the entropy-maximising SCM for inferring counterfactuals in the absence of further information.
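The abstract does not spell out which existing bounds it starts from. As a minimal sketch of what bounding a probability of causation looks like, the block below computes the classical Tian-Pearl bounds on the probability of necessity and sufficiency (PNS) for a binary treatment and outcome, using interventional probabilities only. Treating these as the baseline is an assumption made for illustration, and the function name `pns_bounds` and its parameters are hypothetical, not taken from the paper.

```python
from typing import Tuple


def pns_bounds(p_y_do_x: float, p_y_do_not_x: float) -> Tuple[float, float]:
    """Tian-Pearl bounds on PNS from experimental data alone.

    p_y_do_x     : P(Y = 1 | do(X = 1)), outcome rate under treatment
    p_y_do_not_x : P(Y = 1 | do(X = 0)), outcome rate under control

    Illustrative helper only (hypothetical name, not the paper's method):
        max(0, P(y_x) - P(y_x')) <= PNS <= min(P(y_x), P(y'_x'))
    """
    for p in (p_y_do_x, p_y_do_not_x):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")

    # Lower bound: the average treatment effect, clipped at zero.
    lower = max(0.0, p_y_do_x - p_y_do_not_x)
    # Upper bound: PNS cannot exceed P(y_x) or P(y'_x') = 1 - P(y_x').
    upper = min(p_y_do_x, 1.0 - p_y_do_not_x)
    return lower, upper


if __name__ == "__main__":
    lower, upper = pns_bounds(p_y_do_x=0.75, p_y_do_not_x=0.5)
    print(f"PNS lies in [{lower}, {upper}]")
```

Under monotonicity the PNS is point identified and equals the lower bound, which is why dropping that assumption leaves an interval rather than a single value.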