Federated learning (FL) involves multiple distributed devices jointly training a shared model without any participant having to reveal its local data to a centralized server. Most previous FL approaches assume that the data on devices are fixed and stationary during training. However, this assumption is unrealistic because devices usually have varying sampling rates and different system configurations. In addition, the underlying distribution of the device data can change dynamically over time, which is known as concept drift. Concept drift complicates the learning process because existing and upcoming data become inconsistent. Traditional concept drift handling techniques, such as chunk-based and ensemble learning-based methods, are not suitable for federated learning frameworks due to the heterogeneity of local devices. We propose a novel approach, FedConD, to detect and handle concept drift on local devices and to minimize its effect on model performance in asynchronous FL. The drift detection strategy is based on an adaptive mechanism that uses the historical performance of the local models. Drift adaptation is realized by adjusting the regularization parameter of the objective function on each local device. Additionally, we design a communication strategy on the server side that selects local updates prudently and speeds up model convergence. Experimental evaluations on three evolving data streams and two image datasets show that FedConD detects and handles concept drift, and also reduces the overall communication cost compared to other baseline methods.
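As a rough illustration of the two local-device ideas named above (detection from historical performance, adaptation through the regularization parameter), the following minimal Python sketch may help. It is not the FedConD algorithm: the class `LossHistoryDriftDetector`, the functions `adapt_regularization` and `local_objective`, the mean-plus-deviation threshold rule, the proximal form of the regularizer, and all default constants are assumptions made only for this example. In particular, the direction in which the regularization parameter is adjusted after drift is assumed here, not taken from the abstract.

```python
from collections import deque
from statistics import mean, pstdev


class LossHistoryDriftDetector:
    """Hypothetical detector: flags drift when the newest local loss is
    far above the recent loss history (an assumed rule, not FedConD's)."""

    def __init__(self, window=5, threshold=3.0, min_std=1e-3):
        self.history = deque(maxlen=window)  # most recent local losses
        self.threshold = threshold           # allowed deviations above the mean
        self.min_std = min_std               # floor to avoid a zero-width band

    def update(self, loss):
        """Record a new loss and return True if it looks like drift."""
        drift = False
        if len(self.history) == self.history.maxlen:
            mu = mean(self.history)
            sigma = max(pstdev(self.history), self.min_std)
            drift = loss > mu + self.threshold * sigma
        if drift:
            # Old losses describe the old concept, so start a fresh history.
            self.history.clear()
        self.history.append(loss)
        return drift


def adapt_regularization(lam, drift, shrink=0.5, grow=1.1,
                         lam_min=1e-4, lam_max=1.0):
    """Assumed rule: loosen the pull toward the global model when drift is
    detected, and tighten it slowly again while the stream is stable."""
    new_lam = lam * shrink if drift else lam * grow
    return min(max(new_lam, lam_min), lam_max)


def local_objective(task_loss, local_w, global_w, lam):
    """Task loss plus an assumed proximal regularizer:
    task_loss + (lam / 2) * ||local_w - global_w||^2."""
    prox = sum((a - b) ** 2 for a, b in zip(local_w, global_w))
    return task_loss + 0.5 * lam * prox
```

In this sketch a device would call `update` once per local training round, pass the returned flag to `adapt_regularization`, and use the resulting coefficient in `local_objective` for the next round.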