Secure aggregation, a core component of federated learning, aggregates locally trained models from distributed users at a central server. The aggregation is "secure" in the sense that the server must learn nothing about the users' local data beyond the aggregate of the local models. To guarantee security, some keys may be shared among the users (the key sharing phase). After the key sharing phase, each user masks its trained model and sends the masked model to the server (the model aggregation phase). This paper follows the information-theoretic secure aggregation problem originally formulated by Zhao and Sun, with the objective of characterizing the minimum communication cost from the $K$ users in the model aggregation phase. Due to user dropouts, the server may not receive the messages from all users. A secure aggregation scheme should tolerate the dropout of at most $K-U$ users, where $U$ is a system parameter. The optimal communication cost was characterized by Zhao and Sun, but under the assumption that the keys stored by the users can be any random variables with arbitrary dependency. Motivated by the fact that uncoded groupwise keys are more convenient to share and can be used in a wide range of applications besides federated learning, in this paper we add one constraint to the above problem: the key variables are mutually independent, and each key is shared by a group of at most $S$ users, where $S$ is another system parameter. To the best of our knowledge, all existing secure aggregation schemes assign coded keys to the users. We show that if $S > K - U$, a new secure aggregation scheme with uncoded groupwise keys can achieve the same optimal communication cost as the best scheme with coded keys; if $S \leq K - U$, uncoded groupwise key sharing is strictly sub-optimal.
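To make the two phases concrete, the following is a minimal sketch of masking with uncoded, mutually independent groupwise keys in the simplest setting: groups of size $S = 2$ (pairwise keys) and no dropouts ($U = K$). It is not the scheme proposed in the paper and does not handle dropouts; it only illustrates how independent keys shared within groups can cancel in the sum. The field size `Q`, the function names, and the sign convention are all assumptions made for illustration.

```python
import itertools
import random

# Hypothetical prime field size; an assumption, not taken from the paper.
Q = 2**31 - 1


def share_pairwise_keys(num_users, dim, rng):
    """Key sharing phase: one independent uniform key vector per pair of users."""
    return {
        pair: [rng.randrange(Q) for _ in range(dim)]
        for pair in itertools.combinations(range(num_users), 2)
    }


def mask_model(user, model, keys):
    """Model aggregation phase, user side.

    For each pair (i, j) with i < j, user i adds the shared key and
    user j subtracts it, so every key cancels in the overall sum.
    """
    masked = list(model)
    for (i, j), key in keys.items():
        if user == i:
            sign = 1
        elif user == j:
            sign = -1
        else:
            continue  # this key is not held by `user`
        masked = [(m + sign * k) % Q for m, k in zip(masked, key)]
    return masked


def aggregate(messages):
    """Server side: coordinate-wise sum of the received masked models, mod Q."""
    return [sum(col) % Q for col in zip(*messages)]


def run_demo(seed=0):
    """Return (server aggregate, true sum of models) for K = 4 users, no dropouts."""
    rng = random.Random(seed)
    num_users, dim = 4, 3
    models = [[rng.randrange(Q) for _ in range(dim)] for _ in range(num_users)]
    keys = share_pairwise_keys(num_users, dim, rng)
    messages = [mask_model(u, models[u], keys) for u in range(num_users)]
    true_sum = [sum(col) % Q for col in zip(*models)]
    return aggregate(messages), true_sum
```

Calling `run_demo()` returns two equal vectors: the server recovers the sum of the models even though each individual message is offset by uniformly random keys. Tolerating up to $K-U$ dropouts requires additional structure that this sketch deliberately omits.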