Bayesian network structure learning is traditionally carried out at a central site where all data is gathered. In practice, however, data may be distributed across different parties (e.g., companies, devices) who intend to collectively learn a Bayesian network but are unwilling to disclose information related to their data owing to privacy or security concerns. In this work, we present a cross-silo federated learning approach to estimate the structure of a Bayesian network from data that is horizontally partitioned across different parties. We develop a distributed structure learning method based on continuous optimization, using the alternating direction method of multipliers (ADMM), such that only the model parameters have to be exchanged during the optimization process. We demonstrate the flexibility of our approach by adopting it for both linear and nonlinear cases. Experimental results on synthetic and real datasets show that it achieves improved performance over other methods, especially when there is a relatively large number of clients and each has a limited sample size.
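To illustrate the communication pattern the abstract describes, where clients hold horizontally partitioned samples and exchange only model parameters, the following is a minimal sketch of consensus ADMM in Python with NumPy. It is not the paper's method: as a simplifying assumption it solves an ordinary least-squares problem, and it omits the acyclicity handling that Bayesian network structure learning requires. The function names `consensus_admm_least_squares` and `make_clients`, the penalty value `rho`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def consensus_admm_least_squares(client_data, rho=1.0, n_iters=300):
    """Consensus ADMM (scaled form) for a least-squares toy problem.

    client_data: list of (X_k, y_k) pairs, one per client. Raw samples are
    only touched in the per-client step; the "server" step sees parameters.
    Illustrative stand-in only: no acyclicity constraint is enforced.
    """
    n_clients = len(client_data)
    d = client_data[0][0].shape[1]

    z = np.zeros(d)               # shared (global) parameters
    w = np.zeros((n_clients, d))  # local parameters, one row per client
    u = np.zeros((n_clients, d))  # scaled dual variables, one row per client

    # Each client can precompute these from its private data once.
    grams = [X.T @ X + rho * np.eye(d) for X, _ in client_data]
    moments = [X.T @ y for X, y in client_data]

    for _ in range(n_iters):
        # Client step: minimize local loss + (rho/2)||w - z + u_k||^2,
        # which has a closed-form solution for least squares.
        for k in range(n_clients):
            w[k] = np.linalg.solve(grams[k], moments[k] + rho * (z - u[k]))

        # Server step: average the parameter vectors received from clients.
        z = (w + u).mean(axis=0)

        # Dual step: each client accumulates its disagreement with z.
        u += w - z

    return z


def make_clients(n_clients=5, n_per_client=50, d=3, seed=0):
    """Generate synthetic, horizontally partitioned regression data."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=d)
    clients = []
    for _ in range(n_clients):
        X = rng.normal(size=(n_per_client, d))
        y = X @ w_true + 0.1 * rng.normal(size=n_per_client)
        clients.append((X, y))
    return clients
```

Calling `consensus_admm_least_squares(make_clients(), rho=50.0)` should return a parameter vector close to the least-squares fit on the pooled data, even though no client's samples are ever combined. The penalty `rho` is set near the scale of each client's Gram matrix here only to speed up convergence in this toy setting.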