Federated learning (FL), which uses communication between a server (core) and local devices (edges) to indirectly learn from more data, is an emerging field in deep learning research. Recently, knowledge distillation-based FL methods have been proposed that offer notable performance and broad applicability. In this paper, we take a knowledge distillation-based FL method as our baseline and tackle a challenging problem that arises from using such methods. In particular, we focus on a problem that occurs in the server model as it tries to mimic different datasets, each of which is unique to an individual edge device. We dub this problem 'edge bias'; it occurs when multiple teacher models, each trained on a different dataset, are used individually to distill knowledge. We introduce this nuisance, which arises in certain FL scenarios, and to alleviate it we propose a simple yet effective distillation scheme named 'buffered distillation'. In addition, we experimentally show that this scheme is also effective in mitigating the straggler problem caused by delayed edges.
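To make the setting concrete, the sketch below illustrates the kind of multi-teacher distillation step in which edge bias can arise: a server (student) model is distilled, teacher by teacher, from edge models that were each trained on a different local dataset. This is a minimal sketch assuming a PyTorch-style API; the models, the public distillation loader, and the plain averaged per-teacher KL loss are illustrative assumptions, not the paper's buffered distillation scheme.

```python
# Minimal sketch (assumed PyTorch-style setup) of vanilla multi-teacher
# distillation on the server. Each edge teacher has only seen its own local
# data, so its soft predictions reflect that edge's distribution -- the
# situation in which the abstract's 'edge bias' is said to occur.
import torch
import torch.nn.functional as F

def distill_from_edges(server_model, edge_models, distill_loader,
                       optimizer, temperature=2.0):
    """One distillation pass of the server (student) over a shared distillation set."""
    server_model.train()
    for x, _ in distill_loader:                      # labels unused: pure distillation
        optimizer.zero_grad()
        student_logits = server_model(x)
        loss = 0.0
        for teacher in edge_models:                  # each teacher trained on a different edge dataset
            teacher.eval()
            with torch.no_grad():
                teacher_logits = teacher(x)
            # temperature-scaled KL between student and this teacher's soft targets
            loss = loss + F.kl_div(
                F.log_softmax(student_logits / temperature, dim=1),
                F.softmax(teacher_logits / temperature, dim=1),
                reduction="batchmean",
            ) * temperature ** 2
        (loss / len(edge_models)).backward()
        optimizer.step()
```

Here each teacher is queried individually and the losses are simply averaged; the paper's buffered distillation is proposed as an alternative to this naive per-teacher imitation.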