Many modern applications collect data that is federated in nature, kept locally and undisclosed. To date, most approaches to causal inference require data to be stored in a central repository. We present a novel framework for causal inference with federated data sources. We assess and integrate local causal effects from different private data sources without centralizing them. We then estimate treatment effects on subjects from observational data using a non-parametric reformulation of the classical potential outcomes framework. We model the potential outcomes as random functions distributed according to Gaussian processes, whose defining parameters can be learned efficiently from multiple data sources while respecting privacy constraints. We demonstrate the promise and efficiency of the proposed approach through a set of simulated and real-world benchmark examples.
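To make the general idea concrete, the following is a minimal sketch and not the framework proposed here. It assumes each source fits one Gaussian process per treatment arm with fixed hyperparameters, computes a local average treatment effect (ATE), and shares only that estimate and its sample count with a server that takes a sample-size-weighted average. The function names (`local_ate`, `aggregate`, `simulate_source`), the RBF kernel settings, the weighting rule, and the synthetic data with randomized treatment are all assumptions made for illustration. The actual approach learns the Gaussian process parameters jointly across sources, which this sketch does not attempt.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between row-wise inputs a (n, d) and b (m, d)."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return variance * np.exp(-0.5 * np.maximum(sq_dists, 0.0) / length_scale**2)


def gp_posterior_mean(x_train, y_train, x_test, noise=0.1):
    """Posterior mean of a GP regression with fixed (assumed) hyperparameters."""
    k_train = rbf_kernel(x_train, x_train) + noise**2 * np.eye(len(x_train))
    k_cross = rbf_kernel(x_test, x_train)
    # Centre the targets so that a zero-mean prior is a reasonable assumption.
    y_mean = y_train.mean()
    alpha = np.linalg.solve(k_train, y_train - y_mean)
    return k_cross @ alpha + y_mean


def local_ate(x, t, y):
    """Fit one GP per treatment arm; return (local ATE estimate, sample count)."""
    mu1 = gp_posterior_mean(x[t == 1], y[t == 1], x)  # predicted outcome if treated
    mu0 = gp_posterior_mean(x[t == 0], y[t == 0], x)  # predicted outcome if control
    return float(np.mean(mu1 - mu0)), len(x)


def aggregate(local_results):
    """Sample-size-weighted average of local estimates; raw data is never pooled."""
    estimates, counts = zip(*local_results)
    return float(np.average(estimates, weights=counts))


def simulate_source(rng, n, true_effect=2.0):
    """Hypothetical synthetic source with a constant treatment effect."""
    x = rng.uniform(-2.0, 2.0, size=(n, 1))
    t = rng.binomial(1, 0.5, size=n)  # randomized treatment assignment
    y = np.sin(x[:, 0]) + true_effect * t + rng.normal(0.0, 0.1, size=n)
    return x, t, y


rng = np.random.default_rng(0)
sources = [simulate_source(rng, n) for n in (80, 120, 100)]
federated_ate = aggregate([local_ate(x, t, y) for x, t, y in sources])
print(f"federated ATE estimate: {federated_ate:.2f}")
```

Only the pair returned by `local_ate` leaves each source, which is the simplest way to respect the constraint that data stays local. Sharing summary statistics alone does not amount to a formal privacy guarantee, and with observational rather than randomized data the per-arm fits would also need to account for confounding.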