While causal models are becoming one of the mainstays of machine learning, uncertainty quantification in causal inference remains challenging. In this paper, we study the causal data fusion problem, where datasets pertaining to multiple causal graphs are combined to estimate the average treatment effect of a target variable. Because the data arise from multiple sources and can vary in quality and quantity, principled uncertainty quantification becomes essential. To that end, we introduce Bayesian Interventional Mean Processes, a framework that combines ideas from probabilistic integration and kernel mean embeddings to represent interventional distributions in a reproducing kernel Hilbert space, while taking into account the uncertainty within each causal graph. To demonstrate the utility of our uncertainty estimates, we apply our method to the Causal Bayesian Optimisation task and show improvements over state-of-the-art methods.