Federated learning has emerged as a popular technique for distributing machine learning (ML) model training across the wireless edge. In this paper, we propose two-timescale hybrid federated learning (TT-HF), a hybrid between the device-to-server communication paradigm in federated learning and device-to-device (D2D) communications for model training. In TT-HF, during each global aggregation interval, devices (i) perform multiple stochastic gradient descent iterations on their individual datasets, and (ii) aperiodically engage in consensus formation over their model parameters through cooperative, distributed D2D communications within local clusters. With a new general definition of gradient diversity, we formally study the convergence behavior of TT-HF, resulting in new convergence bounds for distributed ML. We leverage our convergence bounds to develop an adaptive control algorithm that tunes the step size, D2D communication rounds, and global aggregation period of TT-HF over time to target a sublinear convergence rate of O(1/t) while minimizing network resource utilization. Our subsequent experiments demonstrate that TT-HF significantly outperforms the current state of the art in federated learning in terms of model accuracy and/or network energy consumption in different scenarios where local device datasets exhibit statistical heterogeneity.
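The two-timescale loop described above can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the task (local least-squares regression), the cluster layout, the consensus schedule (a fixed interval standing in for the aperiodic D2D rounds), and all parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 4 devices in 2 D2D clusters, each with a local
# least-squares dataset drawn around a common ground-truth model.
n_devices, dim = 4, 3
clusters = [[0, 1], [2, 3]]
X = [rng.normal(size=(20, dim)) for _ in range(n_devices)]
w_true = rng.normal(size=dim)
y = [x @ w_true + 0.1 * rng.normal(size=20) for x in X]

w = [np.zeros(dim) for _ in range(n_devices)]  # local models

def sgd_step(i, w_i, eta):
    """One stochastic gradient step on device i's local squared loss."""
    j = rng.integers(len(y[i]))
    g = (X[i][j] @ w_i - y[i][j]) * X[i][j]
    return w_i - eta * g

eta = 0.05             # step size (tuned adaptively in TT-HF; fixed here)
consensus_every = 5    # D2D consensus schedule (aperiodic in TT-HF)
tau = 10               # global aggregation period

global_w = np.zeros(dim)
for t in range(1, 101):
    # (i) local SGD iterations on each device's own dataset
    w = [sgd_step(i, w[i], eta) for i in range(n_devices)]
    # (ii) D2D consensus: devices within a cluster average their models
    if t % consensus_every == 0:
        for c in clusters:
            avg = np.mean([w[i] for i in c], axis=0)
            for i in c:
                w[i] = avg.copy()
    # device-to-server global aggregation every tau iterations
    if t % tau == 0:
        global_w = np.mean(w, axis=0)
        w = [global_w.copy() for _ in range(n_devices)]

print("error:", np.linalg.norm(global_w - w_true))
```

Here the cluster-wise averaging stands in for the cooperative consensus rounds, which in TT-HF run over the D2D graph rather than via exact in-cluster means, and the fixed `tau` and `eta` stand in for the values chosen by the paper's adaptive control algorithm.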