In this study, we investigate the emerging setting of Streaming Federated Learning (SFL) and propose local cache update rules that handle dynamic data distributions under limited cache capacity. Traditional federated learning assumes static local datasets, whereas in SFL data arrives as a stream whose distribution changes over time, creating a discrepancy between each client's local training dataset and the long-term data distribution. To mitigate this problem, we propose three local cache update rules that refresh each client's cache under its capacity constraint: First-In-First-Out (FIFO), Static Ratio Selective Replacement (SRSR), and Dynamic Ratio Selective Replacement (DRSR). We further derive a convergence bound for the proposed SFL algorithm as a function of the discrepancy between the long-term data distribution and each client's local training dataset. We evaluate the proposed algorithm on two datasets, a network traffic classification dataset and an image classification dataset, and our experimental results show that the proposed cache update rules significantly reduce the distribution discrepancy and outperform baseline methods. Our study advances SFL and provides practical cache management solutions for federated learning.
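For intuition, the following minimal sketch illustrates what capacity-bounded cache update rules of this kind can look like. Only the FIFO behavior follows directly from the abstract; the RatioSelectiveCache class, its ratio parameter, and the random-replacement logic are hypothetical illustrations of a ratio-based selective replacement rule, since the abstract does not specify the exact SRSR and DRSR mechanisms.

```python
import random
from collections import deque


class FIFOCache:
    """First-In-First-Out local cache: when full, evict the oldest sample."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = deque()

    def add(self, sample):
        if len(self.buffer) >= self.capacity:
            self.buffer.popleft()  # evict the oldest cached sample
        self.buffer.append(sample)


class RatioSelectiveCache(FIFOCache):
    """Hypothetical ratio-based selective replacement: with probability
    `ratio`, overwrite a uniformly chosen cached sample instead of the
    oldest one. A static `ratio` loosely mirrors SRSR; adapting `ratio`
    over time as the stream drifts would loosely mirror DRSR. The paper's
    actual rules may differ."""

    def __init__(self, capacity, ratio=0.5):
        super().__init__(capacity)
        self.ratio = ratio  # assumed parameter, not taken from the abstract

    def add(self, sample):
        if len(self.buffer) >= self.capacity and random.random() < self.ratio:
            idx = random.randrange(len(self.buffer))
            self.buffer[idx] = sample  # replace a random cached sample
        else:
            super().add(sample)  # otherwise fall back to FIFO eviction
```

In a streaming round, each client would push newly arrived samples through its cache's add method and then train locally on the cache contents, so the choice of replacement rule directly controls how closely the cached set tracks the long-term distribution.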