In a data-driven society where ubiquitous Internet of Things (IoT) devices store large amounts of data at different locations, distributed learning has gained considerable traction. It typically assumes, however, that data are independent and identically distributed (iid) across devices. This assumption rarely holds in practice because devices are heterogeneous. Federated learning (FL) relaxes it and has emerged as a privacy-preserving solution for training a collaborative model over non-iid data distributed across a massive number of devices. Because participation is unrestricted, however, the appearance of malicious devices (attackers) that intend to corrupt the FL model is inevitable. In this work, we aim to identify such attackers and mitigate their impact on the model, specifically under bidirectional label-flipping attacks with collusion. We propose two graph-theoretic algorithms, one based on the Minimum Spanning Tree and one on the k-Densest graph, that leverage correlations between local models. Our FL model can nullify the influence of attackers even when they make up as much as 70% of all clients, whereas prior works could not tolerate more than 50% of clients being attackers. We verify the effectiveness of our algorithms through experiments on two benchmark datasets, MNIST and Fashion-MNIST, with an overwhelming number of attackers. We establish the superiority of our algorithms over existing ones in terms of accuracy, attack success rate, and early detection round.
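As an illustration only, the idea of using a Minimum Spanning Tree over correlations between local models can be sketched as follows. This is a minimal sketch under stated assumptions, not the algorithm proposed in this work: it assumes each local model is flattened into a vector, uses `1 - Pearson correlation` as the edge weight, and splits clients into two groups by cutting the heaviest MST edge. All function names and the synthetic data are hypothetical. Deciding which of the two groups is malicious is deliberately left out, since a majority rule cannot be used when attackers exceed 50% of clients.

```python
import math
import random


def pearson(u, v):
    """Pearson correlation between two equal-length vectors."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su > 0 and sv > 0 else 0.0


def mst_edges(dist):
    """Prim's algorithm on a dense symmetric distance matrix.

    Returns the n - 1 tree edges as (weight, i, j) tuples.
    """
    n = len(dist)
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest node not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((dist[u][parent[u]], parent[u], u))
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return edges


def split_by_heaviest_edge(updates):
    """Split clients into two groups by removing the heaviest MST edge.

    Assumption: edge weight is 1 - Pearson correlation of the flattened
    local models, so weakly or negatively correlated clients are far apart.
    """
    n = len(updates)
    dist = [[1.0 - pearson(updates[i], updates[j]) for j in range(n)]
            for i in range(n)]
    edges = mst_edges(dist)
    heaviest = max(edges)
    kept = [e for e in edges if e is not heaviest]
    adj = {i: [] for i in range(n)}
    for _, i, j in kept:
        adj[i].append(j)
        adj[j].append(i)
    # Collect the connected component that contains client 0.
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return sorted(seen), sorted(set(range(n)) - seen)


def make_demo_updates(seed=0, dim=50, honest=3, attackers=7):
    """Synthetic stand-in for local models (hypothetical data).

    Honest clients follow a shared direction; colluding attackers follow
    the opposite direction. Attackers are 70% of the clients here.
    """
    rng = random.Random(seed)
    base = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    updates = []
    for k in range(honest + attackers):
        sign = 1.0 if k < honest else -1.0
        updates.append([sign * b + rng.gauss(0.0, 0.05) for b in base])
    return updates


if __name__ == "__main__":
    group_a, group_b = split_by_heaviest_edge(make_demo_updates())
    print(group_a, group_b)
```

In this toy setup the colluding clients are strongly correlated with each other and anti-correlated with the honest ones, so the MST contains a single expensive edge bridging the two groups, and cutting it separates them. Real local models trained on non-iid data would be far noisier than this synthetic example.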