Causal discovery aims to learn a causal graph from observational data. To date, most causal discovery methods require the data to be stored on a central server. However, data owners are increasingly reluctant to share their personal data for fear of privacy leakage, which blocks this task at its very first step. A question arises: $\textit{how do we infer causal relations from decentralized data?}$ In this paper, under the additive noise model assumption, we take a first step by developing a gradient-based learning framework named DAG-Shared Federated Causal Discovery (DS-FCD), which learns the causal graph without directly accessing local data and naturally handles data heterogeneity. DS-FCD benefits from a two-level structure in each local model. The first level learns the causal graph and communicates with the server to obtain model information from other clients, while the second level approximates the causal mechanisms and is updated only on the client's own data to accommodate data heterogeneity. Moreover, DS-FCD formulates the overall learning task as a continuous optimization problem by exploiting an equality acyclicity constraint, so that it can be solved with gradient descent methods. Extensive experiments on both synthetic and real-world datasets verify the efficacy of the proposed method.
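To make the idea of an equality acyclicity constraint concrete, the sketch below evaluates one well-known choice, the polynomial function $h(W) = \mathrm{tr}\big((I + W \circ W / d)^d\big) - d$, which is zero exactly when the weighted adjacency matrix $W$ contains no directed cycle. The abstract does not say which constraint DS-FCD uses, so this particular form is an assumption. The helper `average_graphs` is likewise hypothetical: it stands in for a server step that averages the clients' graph-level parameters and is not taken from the paper.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain square matrix product, kept dependency-free for illustration."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def acyclicity(w: Matrix) -> float:
    """h(W) = tr((I + W∘W / d)^d) - d; equals 0 iff W has no directed cycle.

    Assumed form of the constraint; the paper's exact choice may differ.
    """
    d = len(w)
    # Squaring each entry makes all weights non-negative, so cycles cannot cancel.
    base = [[(1.0 if i == j else 0.0) + w[i][j] ** 2 / d for j in range(d)]
            for i in range(d)]
    power = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    for _ in range(d):
        power = matmul(power, base)
    return sum(power[i][i] for i in range(d)) - d


def average_graphs(client_ws: List[Matrix]) -> Matrix:
    """Hypothetical server step: element-wise mean of the clients' graph weights."""
    m, d = len(client_ws), len(client_ws[0])
    return [[sum(w[i][j] for w in client_ws) / m for j in range(d)]
            for i in range(d)]


dag = [[0.0, 1.5, 0.0], [0.0, 0.0, -0.7], [0.0, 0.0, 0.0]]     # 0 -> 1 -> 2
cyclic = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]   # 0 -> 1 -> 2 -> 0

print(acyclicity(dag))     # zero for an acyclic graph
print(acyclicity(cyclic))  # strictly positive for a cyclic graph
```

Because $h$ is a smooth function of $W$, it can be added to a data-fitting loss as a penalty or Lagrangian term and handled by gradient-based solvers. Note that averaging two acyclic graphs can yield a cyclic one (for example, an edge $0 \to 1$ on one client and $1 \to 0$ on another), which is one reason a constraint of this kind would still need to be enforced on any aggregated graph in this sketch.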