Decentralized and incomplete data sources are prevalent in real-world applications, posing a formidable challenge for causal inference. These sources cannot be consolidated into a single entity owing to privacy constraints, and the presence of missing values within them can potentially introduce bias to the causal estimands. We introduce a new approach for federated causal inference from incomplete data, enabling the estimation of causal effects from multiple decentralized and incomplete data sources. Our approach disentangles the loss function into multiple components, each corresponding to a specific data source with missing values. Our approach accounts for the missing data under the missing at random assumption, while also estimating higher-order statistics of the causal estimands. Our method recovers the conditional distribution of missing confounders given the observed confounders from the decentralized data sources to identify causal effects. Our framework estimates heterogeneous causal effects without the sharing of raw training data among sources, which helps to mitigate privacy risks. The efficacy of our approach is demonstrated through a collection of simulated and real-world instances, illustrating its potential and practicality.
翻译:暂无翻译