Federated Learning (FL) is an emerging learning scheme that allows distributed clients to train deep neural networks collaboratively without sharing data. Neural networks have gained great popularity due to their unprecedented success. To the best of our knowledge, the theoretical guarantees of FL concerning neural networks with explicit forms and multi-step updates are unexplored. Nevertheless, training analysis of neural networks in FL is non-trivial for two reasons: first, the objective loss function we are optimizing is non-smooth and non-convex, and second, we are not even updating in the gradient direction. Existing convergence results for gradient-descent-based methods heavily rely on the fact that the gradient direction is used for updating. This paper presents a new class of convergence analysis for FL, Federated Learning Neural Tangent Kernel (FL-NTK), which corresponds to overparameterized ReLU neural networks trained by gradient descent in FL and is inspired by the analysis in Neural Tangent Kernel (NTK). Theoretically, FL-NTK converges to a global-optimal solution at a linear rate with properly tuned learning parameters. Furthermore, with proper distributional assumptions, FL-NTK can also achieve good generalization.
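To make the multi-step-update point concrete, the following is a minimal sketch of one FedAvg-style communication round; the notation ($w$, $\eta$, $K$, $N$, $L_c$) is illustrative and not taken from the paper. Each client $c$ starts from the current global model $w_t$, runs $K$ local gradient steps on its own loss $L_c$, and the server averages the resulting local models:
\[
w_c^{(0)} = w_t, \qquad
w_c^{(k+1)} = w_c^{(k)} - \eta\, \nabla L_c\!\big(w_c^{(k)}\big), \quad k = 0, \dots, K-1,
\qquad
w_{t+1} = \frac{1}{N} \sum_{c=1}^{N} w_c^{(K)}.
\]
For $K > 1$ the round-level change $w_{t+1} - w_t$ averages multi-step local trajectories and is in general not proportional to the gradient of the global loss $\frac{1}{N}\sum_c L_c$ at $w_t$, which is why standard gradient-descent convergence arguments do not apply directly.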