In this paper, we propose and study a distributed and secure algorithm for computing dominant (or truncated) singular value decompositions (SVD) of large and distributed data matrices. We consider the scenario where each node privately holds a subset of columns and only exchanges "safe" information with other nodes in a collaborative effort to calculate a dominant SVD for the whole matrix. In the framework of alternating direction methods of multipliers (ADMM), we propose a novel formulation for building consensus by equalizing subspaces spanned by splitting variables instead of equalizing themselves. This technique greatly relaxes feasibility restrictions and accelerates convergence significantly, while at the same time yielding simple subproblems. We design several algorithmic features, including a low-rank multiplier formula and mechanisms for controlling subproblem solution accuracies, to increase the algorithm's computational efficiency and reduce its communication overhead. More importantly, unlike most existing distributed or parallelized algorithms, our algorithm preserves the privacy of locally-held data; that is, none of the nodes can recover the data stored in another node through information exchanged during communications. We present convergence analysis results, including a worst-case complexity estimate, and extensive experimental results indicating that the proposed algorithm, while safely guarding data privacy, has a strong potential to deliver a cutting-edge performance, especially when communication costs are high.
翻译:在本文中,我们提出并研究一个分布和安全的算法,用于计算大型和分布式数据矩阵的主要(或缺线的)单值分解。我们考虑了每个节点私人持有一组分栏和仅与其他节点交换“安全”信息,以共同努力计算整个矩阵的支配性SVD。在乘数的交替方向方法(ADMM)框架内,我们提出一个新的公式,通过将分布式或平行的算法等同起来,使分解变量的子空间相互平衡,从而建立共识。这一技术大大放松了可行性限制,加快了趋同速度,同时产生了简单的子问题。我们设计了几种算法特征,包括低层次的乘数公式和机制,用于控制子问题解决方案的切换,以提高算法的计算效率,减少通信的间接成本。更重要的是,与大多数现有的分布式或平行的算法不同,我们的算法保留了当地持有数据的隐私;也就是说,没有一个节点能够通过另一个节点的交换信息来恢复数据,同时产生简单的子问题。我们设计了几种算法,我们设计了低层次的计算结果,这是为了保证一个可靠的精确性分析。我们提出了一种可靠的数据,在分析中可以保证一个可靠的分析。