In this paper, we develop an algorithm for federated principal component analysis (PCA) with an emphasis on both communication efficiency and data privacy. Generally speaking, federated PCA algorithms based on direct adaptations of classic iterative methods, such as simultaneous subspace iteration (SSI), are unable to preserve data privacy, while algorithms based on variable splitting and consensus seeking, such as the alternating direction method of multipliers (ADMM), lack communication efficiency. In this work, we propose a novel consensus-seeking formulation that equalizes the subspaces spanned by the splitting variables instead of equalizing the variables themselves, thus greatly relaxing feasibility restrictions and allowing much faster convergence. We then develop an ADMM-like algorithm with several special features that make it practically efficient, including a low-rank multiplier formula and techniques for treating subproblems. We establish that the proposed algorithm can better protect data privacy than classic methods adapted to the federated PCA setting. We derive convergence results, including a worst-case complexity estimate, for the proposed ADMM-like algorithm in the presence of nonlinear equality constraints. Extensive empirical results are presented to show that the new algorithm, while enhancing data privacy, requires far fewer rounds of communication than existing peer algorithms for federated PCA.
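To make the two ideas contrasted above concrete, the following is a minimal sketch in Python with NumPy, not the algorithm proposed in the paper. It assumes each client i holds a column block A_i of the data matrix, so that the global matrix is the sum of the local products A_i A_i^T. The function names `federated_ssi` and `subspace_gap` are hypothetical, and writing subspace equality as equality of the projectors X X^T and Z Z^T is an assumption made for illustration, since the abstract does not state the constraint explicitly.

```python
import numpy as np


def federated_ssi(local_blocks, p, iters=100, seed=0):
    """Baseline sketch: simultaneous subspace iteration in a federated setting.

    local_blocks: list of arrays A_i with shape (n, m_i), one per client.
    Each round, every client sends the product A_i (A_i^T X) to the server,
    which sums the products and re-orthonormalizes the result by QR.
    """
    rng = np.random.default_rng(seed)
    n = local_blocks[0].shape[0]
    # Random orthonormal starting basis shared with all clients.
    X, _ = np.linalg.qr(rng.standard_normal((n, p)))
    for _ in range(iters):
        # One communication round: the server aggregates the local products.
        Y = sum(A @ (A.T @ X) for A in local_blocks)
        X, _ = np.linalg.qr(Y)
    return X


def subspace_gap(X, Z):
    """Frobenius distance between the projectors onto span(X) and span(Z).

    Assumes X and Z have orthonormal columns. A value of zero means the two
    matrices span the same subspace, even when X and Z differ entrywise.
    """
    return np.linalg.norm(X @ X.T - Z @ Z.T)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((20, 60))
    blocks = np.array_split(A, 3, axis=1)  # three hypothetical clients
    X = federated_ssi(blocks, p=3)

    # Rotating the basis changes the variable but not the subspace it spans.
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    print("variable gap:", np.linalg.norm(X @ Q - X))
    print("subspace gap:", subspace_gap(X @ Q, X))
```

In this sketch, a constraint of the form X_i = Z would reject the rotated basis X Q, whereas a subspace constraint accepts it. This is the sense in which equalizing subspaces relaxes feasibility compared with equalizing the variables themselves.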