Analyses of distributed techniques often focus on efficiency without considering robustness (or the lack thereof). Robustness is particularly important when devices or central servers can fail, since such failures can cripple a distributed system. When failures arise in wireless communication networks, important services that the networks use or provide (such as anomaly detection) can be left inoperable, which can trigger a cascade of security problems. In this paper, we present a novel method that addresses these risks by combining flat and star topologies to obtain the performance and reliability benefits of both. We refer to this method as "Tol-FL", due to its increased failure tolerance compared with Federated Learning. Our approach limits the risks of device failure while outperforming prior methods by up to 8% in anomaly detection AUROC across a range of realistic settings that consider client as well as server failure, all while reducing communication costs. These results demonstrate that Tol-FL is highly suitable for distributed model training for anomaly detection, especially in wireless networks.