When performing Bayesian computations in practice, one is often faced with the challenge that the constituent model components and/or the data are only available in a distributed fashion, e.g. due to privacy concerns or sheer volume. While various methods have been proposed for performing posterior inference in such federated settings, these either make very strong assumptions on the data and/or model or otherwise introduce significant bias when the local posteriors are combined to form an approximation of the target posterior. By leveraging recently developed methods for Markov chain Monte Carlo (MCMC) based on Piecewise Deterministic Markov Processes (PDMPs), we develop a computation- and communication-efficient family of posterior inference algorithms (Fed-PDMC) that provides asymptotically exact approximations of the full posterior over a large class of Bayesian models, allowing heterogeneous model and data contributions from each client. We show that communication between clients and the server preserves the privacy of the individual data sources by establishing differential privacy guarantees. We quantify the performance of Fed-PDMC over a class of illustrative analytical case studies and demonstrate its efficacy on a number of synthetic examples along with realistic Bayesian computation benchmarks.
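To make the idea of PDMP-based sampling from a posterior whose factors live on different clients more concrete, the following is a minimal toy sketch. It is not the Fed-PDMC algorithm itself. It assumes a one-dimensional Zig-Zag process, assumes each client holds a Gaussian factor with its own mean and standard deviation, and ignores privacy noise entirely. The function names `client_event_time` and `federated_zigzag` are hypothetical. Each client simulates the next velocity-flip time implied by its local factor alone, and the server only needs the smallest of these times. Because the local flip rates differ between the two velocities by exactly the gradient of the full potential, the resulting process targets the product of all client factors.

```python
import math
import random


def client_event_time(x, v, mean, std, rng):
    """Next flip time proposed by one client (assumed Gaussian local factor).

    The local rate along the path x + v*t is max(0, c + a*t), with
    a = 1/std**2 and c = v*(x - mean)*a, so the event time can be
    obtained in closed form by inverting the integrated rate.
    """
    a = 1.0 / std**2
    c = v * (x - mean) * a
    e = rng.expovariate(1.0)  # unit-rate exponential draw
    return (-c + math.sqrt(max(c, 0.0) ** 2 + 2.0 * a * e)) / a


def federated_zigzag(clients, n_events, seed=0):
    """Run a 1D Zig-Zag process; return path-averaged mean and variance."""
    rng = random.Random(seed)
    x, v = 0.0, 1.0
    total_t = sum_x = sum_x2 = 0.0
    for _ in range(n_events):
        # Server step: the earliest client event triggers the velocity flip.
        tau = min(client_event_time(x, v, m, s, rng) for m, s in clients)
        x_new = x + v * tau
        # Exact integrals of x and x**2 along the linear segment.
        sum_x += tau * (x + x_new) / 2.0
        sum_x2 += tau * (x * x + x * x_new + x_new * x_new) / 3.0
        total_t += tau
        x, v = x_new, -v
    mean = sum_x / total_t
    return mean, sum_x2 / total_t - mean**2


if __name__ == "__main__":
    # Three hypothetical clients, each given as (mean, std) of its factor.
    toy_clients = [(-1.0, 1.0), (2.0, 0.5), (0.5, 2.0)]
    print(federated_zigzag(toy_clients, 20000, seed=1))
```

In this toy setting the exact posterior is Gaussian with precision equal to the sum of the client precisions, so the path averages can be checked against the closed-form mean and variance. The per-event communication is a single number per client, which gives a rough feel for why PDMP samplers are attractive when communication is costly.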