On the Information Theoretic Secure Aggregation with Uncoded Groupwise Keys

Secure aggregation, a core component of federated learning, aggregates locally trained models from distributed users at a central server. The ``secure'' nature of such aggregation consists in the requirement that no information about the users' local data be leaked to the server beyond the aggregate of the local models. To guarantee security, some keys may be shared among the users. After the key sharing phase, each user masks its trained model and sends the masked model to the server. This paper follows the information theoretic secure aggregation problem originally formulated by Zhao and Sun, with the objective of characterizing the minimum communication cost from the $K$ users in the model aggregation phase. Due to user dropouts, the server may not receive all messages from the users; a secure aggregation scheme should tolerate the dropout of at most $K-U$ users. The optimal communication cost was characterized by Zhao and Sun, but under the assumption that the keys stored by the users could be any random variables with arbitrary dependency. Motivated by the fact that uncoded groupwise keys are more convenient to share and could be used in a wide range of applications besides federated learning, in this paper we assume that the key variables are mutually independent and that each key is shared by a group of $S$ users. To the best of our knowledge, all existing secure aggregation schemes assign coded keys to the users. We show that if $S > K-U$, a new secure aggregation scheme with uncoded groupwise keys can achieve the same optimal communication cost as the best scheme with coded keys; if $S \leq K-U$, uncoded groupwise key sharing is strictly sub-optimal. Finally, we implement the proposed secure aggregation scheme on Amazon EC2 and compare it with existing secure aggregation schemes with offline key sharing.
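
The idea of masking with uncoded groupwise keys can be illustrated with a toy sketch. This is not the scheme proposed in the paper: it assumes no user drops out, and the field size `Q`, the function names, and the choice of zero-sum coefficients are all hypothetical choices made for illustration. Each group of $S$ users holds one independent uniform key, and every member adds a multiple of that key chosen so that the multiples cancel when the server sums all messages.

```python
import itertools
import secrets

Q = 2**31 - 1  # prime field size; an assumption made for this sketch


def groupwise_keys(K, S, L):
    """Draw one independent uniform key vector of length L per S-subset of the K users."""
    return {
        group: [secrets.randbelow(Q) for _ in range(L)]
        for group in itertools.combinations(range(K), S)
    }


def mask(user, model, keys):
    """Mask a model with every key whose group contains this user."""
    out = list(model)
    for group, key in keys.items():
        if user not in group:
            continue
        # Coefficients within a group sum to zero:
        # the first member uses S - 1, every other member uses -1.
        coeff = len(group) - 1 if user == group[0] else -1
        out = [(x + coeff * z) % Q for x, z in zip(out, key)]
    return out


def aggregate(messages):
    """Server side: add the received vectors coordinate-wise modulo Q."""
    return [sum(column) % Q for column in zip(*messages)]


# Usage: K = 4 users, groups of S = 3, models of length L = 5.
K, S, L = 4, 3, 5
keys = groupwise_keys(K, S, L)
models = [[(7 * i + j) % Q for j in range(L)] for i in range(K)]
messages = [mask(i, models[i], keys) for i in range(K)]
assert aggregate(messages) == aggregate(models)  # masks cancel in the sum
```

The sketch only demonstrates that zero-sum masks built from independent groupwise keys cancel at the server. It does not tolerate dropouts and makes no claim about communication cost.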
