In this paper, we propose and study a distributed and secure algorithm for computing dominant (or truncated) singular value decompositions (SVDs) of large, distributed data matrices. We consider the scenario where each node privately holds a subset of the columns and exchanges only "safe" information with other nodes in a collaborative effort to calculate a dominant SVD of the whole matrix. In the framework of the alternating direction method of multipliers (ADMM), we propose a novel formulation that builds consensus by equalizing the subspaces spanned by the splitting variables instead of equalizing the variables themselves. This technique greatly relaxes feasibility restrictions and accelerates convergence significantly, while still yielding simple subproblems. We design several algorithmic features, including a low-rank multiplier formula and mechanisms for controlling the accuracy of subproblem solutions, to increase the algorithm's computational efficiency and reduce its communication overhead. More importantly, it appears unlikely, if possible at all, that a malicious node could uncover the data stored in another node from the quantities shared in our algorithm, which is not the case for existing distributed or parallelized algorithms. We present convergence analysis results, including a worst-case complexity estimate, together with extensive experimental results indicating that the proposed algorithm, while safeguarding data privacy, has strong potential to deliver cutting-edge performance, especially when communication costs are high.
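To make the distinction between equalizing variables and equalizing the subspaces they span concrete, the following is a minimal NumPy sketch. It is an illustration under our own assumptions, not the proposed ADMM algorithm: the names `projector_distance`, `X`, `Y`, and `Q` are hypothetical, and the sizes `n` and `k` are arbitrary. Two matrices with orthonormal columns that differ by an orthogonal factor are unequal as variables, yet their orthogonal projectors coincide, so a subspace-based consensus condition is satisfied where a variable-based one is not.

```python
import numpy as np

def projector_distance(X, Y):
    """Frobenius distance between the orthogonal projectors onto span(X) and span(Y).

    Assumes (for this sketch) that X and Y have orthonormal columns, so that
    X @ X.T and Y @ Y.T are the projectors onto their column spaces.
    """
    return np.linalg.norm(X @ X.T - Y @ Y.T)

rng = np.random.default_rng(0)
n, k = 8, 3  # illustrative sizes only

# Orthonormal basis X of a random k-dimensional subspace of R^n.
X, _ = np.linalg.qr(rng.standard_normal((n, k)))

# A random k-by-k orthogonal matrix Q; Y = X Q spans the same subspace as X.
Q, _ = np.linalg.qr(rng.standard_normal((k, k)))
Y = X @ Q

variable_gap = np.linalg.norm(X - Y)     # generally nonzero: X and Y differ as matrices
subspace_gap = projector_distance(X, Y)  # zero up to rounding: same subspace

print(f"||X - Y||_F         = {variable_gap:.3e}")
print(f"||XX^T - YY^T||_F   = {subspace_gap:.3e}")
```

In this sketch, requiring `X == Y` would reject the pair even though both describe the same dominant subspace, whereas comparing projectors accepts every orthonormal basis of that subspace. This is one way to read the claim that the subspace formulation relaxes feasibility restrictions.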