Federated Learning (FL) is a well-established technique for privacy-preserving distributed training, and much attention has been given to the various aspects of FL training. A growing number of applications that consume FL-trained models, however, increasingly operate under dynamically and unpredictably variable conditions, rendering a single model insufficient. We argue for cost-efficiently training a global family of models in a federated fashion. Training them independently for different tradeoff points, however, incurs $O(k)$ cost for any $k$ architectures of interest. Straightforward application of FL techniques to recent weight-shared training approaches is either infeasible or prohibitively expensive. We propose SuperFed, an architectural framework that incurs $O(1)$ cost to co-train a large family of models in a federated fashion by leveraging weight-shared learning. We achieve an order-of-magnitude saving in both communication and computation cost by proposing two novel training mechanisms: (a) distribution of weight-shared models to federated clients, and (b) central aggregation of arbitrarily overlapping weight-shared model parameters. The combination of these mechanisms is shown to reach an order-of-magnitude (9.43x) reduction in computation and communication cost for training a family of $5\times10^{18}$ models, compared to independently training as few as $k = 9$ DNNs, without any accuracy loss.
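To make mechanism (b) concrete, the following is a minimal sketch (not the authors' implementation) of how a server might average client updates that cover arbitrarily overlapping slices of a shared super-network parameter: each entry is averaged only over the clients whose sub-network included it. The function name `aggregate_overlapping` and the slice-based update format are illustrative assumptions, not part of the paper.

```python
import numpy as np

def aggregate_overlapping(global_param, client_updates):
    """Average overlapping client slices into `global_param`, element-wise.

    client_updates: list of (slices, values) pairs, where `slices` selects the
    sub-tensor of the super-network parameter that the client trained and
    `values` holds the client's updated weights for that sub-tensor.
    Entries covered by no client keep their previous global value.
    """
    acc = np.zeros_like(global_param)   # running sum of client values
    cnt = np.zeros_like(global_param)   # number of clients covering each entry
    for slices, values in client_updates:
        acc[slices] += values
        cnt[slices] += 1.0
    covered = cnt > 0
    new_param = global_param.copy()
    new_param[covered] = acc[covered] / cnt[covered]
    return new_param

# Toy usage: a 4x6 weight matrix; two clients trained overlapping sub-networks.
W = np.zeros((4, 6))
updates = [
    ((slice(0, 2), slice(0, 4)), np.ones((2, 4))),       # small sub-network
    ((slice(0, 4), slice(0, 6)), 2 * np.ones((4, 6))),   # full super-network
]
print(aggregate_overlapping(W, updates))  # overlapping region averages to 1.5
```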