Clustered Federated Multitask Learning (CFL) was introduced as an efficient scheme to obtain reliable specialized models when data is imbalanced and distributed in a non-i.i.d. (non-independent and identically distributed) fashion among clients. While a similarity metric, such as the cosine similarity, can be used to endow groups of clients with specialized models, this process can be arduous because the server must involve all clients in each federated learning round. Given the limited bandwidth and latency constraints at the network edge, it is therefore imperative that only a subset of clients be selected periodically. To this end, this paper proposes a new client selection algorithm that aims to accelerate convergence toward specialized machine learning models that achieve high test accuracy for all client groups. Specifically, we introduce a client selection approach that leverages device heterogeneity to schedule clients based on their round latency, and that exploits bandwidth reuse for clients that need more time to update the model. The server then performs model averaging and clusters the clients based on predefined thresholds. Once a specific cluster reaches a stationary point, the proposed algorithm switches to a greedy scheduling policy for that group, selecting the clients with the lowest latency to update the model. Extensive experiments show that the proposed approach lowers the training time and accelerates the convergence rate by up to 50% while providing each client with a specialized model that fits its local data distribution.
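The two scheduling ingredients described above can be illustrated with a minimal sketch: cosine-similarity clustering of client model updates against a predefined threshold, and greedy selection of the lowest-latency clients within a converged cluster. The function names, the single-representative clustering rule, and the data layout are illustrative assumptions for exposition, not the paper's exact algorithm.

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two flattened model-update vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def cluster_clients(updates, threshold):
    """Greedily group clients whose updates are similar to a cluster's
    first member above `threshold` (illustrative clustering rule)."""
    clusters = []
    for cid, u in updates.items():
        placed = False
        for cluster in clusters:
            if cosine_similarity(u, updates[cluster[0]]) >= threshold:
                cluster.append(cid)
                placed = True
                break
        if not placed:
            clusters.append([cid])  # start a new cluster
    return clusters

def greedy_select(latencies, k):
    """Once a cluster is stationary, pick the k clients with the
    lowest round latency (greedy scheduling step)."""
    return sorted(latencies, key=latencies.get)[:k]

# Three clients: two with near-parallel updates, one orthogonal.
updates = {0: np.array([1.0, 0.0]),
           1: np.array([0.9, 0.1]),
           2: np.array([0.0, 1.0])}
print(cluster_clients(updates, threshold=0.8))   # clients 0 and 1 group together
print(greedy_select({0: 3.2, 1: 1.1, 2: 2.4}, k=2))  # fastest two clients
```

In a full CFL round the vectors would be the clients' weight-update deltas and the threshold would come from the clustering criterion; the sketch only shows the control flow of the selection and grouping steps.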