This work investigates the possibilities that federated learning opens up for IoT malware detection and studies the security issues inherent to this new learning paradigm. To this end, a framework that uses federated learning to detect malware affecting IoT devices is presented. The framework is evaluated on N-BaIoT, a dataset modeling the network traffic of several real IoT devices while affected by malware. Both supervised and unsupervised federated models (a multi-layer perceptron and an autoencoder, respectively) able to detect malware affecting seen and unseen IoT devices of N-BaIoT have been trained and evaluated. Their performance is compared with two traditional approaches: in the first, each participant locally trains a model using only its own data; in the second, the participants share their data with a central entity in charge of training a global model. This comparison shows that using larger and more diverse data, as the federated and centralized methods do, has a considerable positive impact on model performance. Moreover, the federated models achieve results similar to the centralized ones while preserving the participants' privacy. As an additional contribution, and to measure the robustness of the federated approach, an adversarial setup in which several malicious participants poison the federated model is considered. The baseline averaging step used for model aggregation in most federated learning algorithms appears highly vulnerable to different attacks, even with a single adversary. The performance of other model aggregation functions acting as countermeasures is therefore evaluated under the same attack scenarios. These functions provide a significant improvement against malicious participants, but more effort is still needed to make federated approaches robust.
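To illustrate why plain averaging is fragile, the following minimal sketch compares coordinate-wise averaging with a coordinate-wise median when one participant submits a poisoned update. The abstract does not name the alternative aggregation functions that were evaluated, so the median here is an assumed example of a robust aggregator, and the function names and toy weight vectors are hypothetical.

```python
from statistics import fmean, median


def aggregate_mean(updates):
    """Coordinate-wise average of the participants' weight vectors."""
    return [fmean(coord) for coord in zip(*updates)]


def aggregate_median(updates):
    """Coordinate-wise median (assumed stand-in for a robust aggregator)."""
    return [median(coord) for coord in zip(*updates)]


# Hypothetical toy updates: four honest participants and one adversary.
honest = [[0.9, -1.1], [1.0, -1.0], [1.1, -0.9], [1.0, -1.0]]
poisoned = [[1000.0, 1000.0]]  # a single adversary sending extreme values
updates = honest + poisoned

mean_result = aggregate_mean(updates)      # dragged far from the honest values
median_result = aggregate_median(updates)  # stays at the honest consensus

print("mean:  ", mean_result)
print("median:", median_result)
```

In this toy case a single extreme update shifts every averaged coordinate by roughly 200, while the median stays at the honest values. A median only tolerates a minority of adversaries, which is consistent with the observation that more effort is still needed to make federated approaches robust.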