Blind source separation algorithms such as independent component analysis (ICA) are widely used in the analysis of neuroimaging data. In order to leverage larger sample sizes, different data holders/sites may wish to collaboratively learn feature representations. However, such datasets are often privacy-sensitive, precluding centralized analyses that pool the data at a single site. In this work, we propose a differentially private algorithm for performing ICA in a decentralized data setting. Conventional approaches to decentralized differentially private algorithms may introduce too much noise due to the typically small sample sizes at each site. We propose a novel protocol that uses correlated noise to remedy this problem. We show that our algorithm outperforms existing approaches on synthetic and real neuroimaging datasets and demonstrate that it can sometimes reach the same level of utility as the corresponding non-private algorithm. This indicates that it is possible to have meaningful utility while preserving privacy.
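To illustrate why correlated noise can help when each site holds few samples, here is a minimal sketch in Python. It is not the protocol proposed in this work, whose details the abstract does not give. It assumes each site releases a scalar statistic perturbed by Gaussian noise of standard deviation `tau`, and that the sites can jointly draw a noise component that sums to zero across sites. The helper names and the variance split between the zero-sum and independent components are assumptions chosen for illustration. The sketch carries no privacy proof and leaves out the ICA step and any secure aggregation.

```python
import random
import statistics


def zero_sum_noise(num_sites, std, rng):
    """Hypothetical helper: Gaussian draws centered so they sum to zero across sites."""
    raw = [rng.gauss(0.0, std) for _ in range(num_sites)]
    mean = sum(raw) / num_sites
    return [r - mean for r in raw]


def aggregate_independent(local_values, tau, rng):
    """Each site adds independent noise; the aggregator averages the noisy values."""
    noisy = [v + rng.gauss(0.0, tau) for v in local_values]
    return sum(noisy) / len(noisy)


def aggregate_correlated(local_values, tau, rng):
    """Each site adds a zero-sum term plus a smaller independent term.

    Assumed variance split: the zero-sum term has variance tau^2 * (1 - 1/S)
    and the independent term has variance tau^2 / S, so each site's released
    value still carries total noise variance tau^2.
    """
    s = len(local_values)
    e = zero_sum_noise(s, tau, rng)  # cancels when the aggregator averages
    g_std = tau / s ** 0.5           # the only noise left after averaging
    noisy = [v + e_i + rng.gauss(0.0, g_std) for v, e_i in zip(local_values, e)]
    return sum(noisy) / s


def compare_error_std(num_sites=10, tau=1.0, trials=2000, seed=0):
    """Empirical std of the aggregate error under both schemes."""
    rng = random.Random(seed)
    local_values = [0.0] * num_sites  # true aggregate is 0, so output equals error
    ind = [aggregate_independent(local_values, tau, rng) for _ in range(trials)]
    cor = [aggregate_correlated(local_values, tau, rng) for _ in range(trials)]
    return statistics.pstdev(ind), statistics.pstdev(cor)


if __name__ == "__main__":
    ind_std, cor_std = compare_error_std()
    # In theory the stds are tau / sqrt(S) (independent) and tau / S (correlated).
    print(f"independent noise: {ind_std:.3f}")
    print(f"correlated noise:  {cor_std:.3f}")
```

Under these assumptions, the aggregate error shrinks from roughly `tau / sqrt(S)` to roughly `tau / S` for `S` sites, while each site still releases a value with the same total noise variance. The sketch does not show that `tau / S` is the noise level a pooled analysis would need.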