The Shapley value (SV) is a fair and principled metric for contribution evaluation in cross-silo federated learning (cross-silo FL), wherein organizations, i.e., clients, collaboratively train prediction models with the coordination of a parameter server. However, existing SV calculation methods for FL assume that the server can access the raw FL models and public test data. This assumption may not hold in practice, given the emerging privacy attacks on FL models and the fact that test data might be clients' private assets. Hence, we investigate the problem of secure SV calculation for cross-silo FL. We first propose HESV, a one-server solution that relies solely on homomorphic encryption (HE) for privacy protection but has limited efficiency. To overcome these limitations, we propose SecSV, an efficient two-server protocol with the following novel features. First, SecSV adopts a hybrid privacy protection scheme that avoids ciphertext--ciphertext multiplications between test data and models, which are extremely expensive under HE. Second, we design an efficient secure matrix multiplication method for SecSV. Third, SecSV strategically identifies and skips some test samples without significantly affecting the evaluation accuracy. Our experiments demonstrate that SecSV is 7.2--36.6 times as fast as HESV, with a limited loss in the accuracy of calculated SVs.
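For reference, a client's Shapley value is its average marginal contribution over all coalitions of the other clients. The Python sketch below computes exact SVs by brute-force enumeration; the `utility` function is a hypothetical placeholder that scores a coalition (e.g., by the test accuracy of the model aggregated from it). This is only an illustration of the SV definition, not the HESV or SecSV protocol, which additionally protect the models and test data.

```python
from itertools import combinations
from math import factorial

def shapley_values(clients, utility):
    """Exact Shapley values by enumerating all coalitions.

    `clients` is a list of client identifiers; `utility(S)` returns the value
    of a coalition S given as a tuple of clients. Exponential in the number of
    clients, hence only feasible for the small client counts typical of
    cross-silo FL.
    """
    n = len(clients)
    sv = {c: 0.0 for c in clients}
    for c in clients:
        others = [x for x in clients if x != c]
        for k in range(n):
            for S in combinations(others, k):
                # Weight |S|! (n - |S| - 1)! / n! times the marginal contribution of c.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                sv[c] += weight * (utility(S + (c,)) - utility(S))
    return sv

# Toy usage with a purely illustrative utility (coalition size).
if __name__ == "__main__":
    print(shapley_values(["A", "B", "C"], lambda S: len(S)))
```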