Machine learning (ML) has proven effective at detecting network attacks when a system is designed and evaluated within a single organisation. However, designing an ML-based detection system from heterogeneous network data samples originating from several sources has been very challenging, mainly because of privacy concerns and the lack of a universal dataset format. In this paper, we propose a collaborative federated learning scheme to address these issues. The proposed framework allows multiple organisations to join forces in the design, training, and evaluation of a robust ML-based network intrusion detection system. The threat intelligence scheme relies on two critical aspects: first, the availability of network traffic data in a common format, which allows meaningful patterns to be extracted across data sources; and second, the adoption of a federated learning mechanism, which avoids the need to share sensitive user information between organisations. As a result, each organisation benefits from other organisations' cyber threat intelligence while keeping its data private and internal. The model is trained locally, and only the updated weights are shared with the remaining participants in the federated averaging process. The framework is designed and evaluated using two key datasets in NetFlow format, NF-UNSW-NB15-v2 and NF-BoT-IoT-v2. Two other common scenarios are considered in the evaluation: a centralised training method, where local data samples are shared with other organisations, and a localised training method, where no threat intelligence is shared. The results demonstrate the efficiency and effectiveness of the proposed framework, which yields a universal ML model that effectively classifies benign and intrusive traffic originating from multiple organisations without the need for local data exchange.
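The local-training and weight-sharing step described above can be illustrated with a minimal sketch of one federated averaging round. This is not the paper's implementation: the abstract does not specify the model, so a plain logistic-regression classifier is assumed here, and the names `local_update`, `federated_average`, and `run_round` are hypothetical. Each client is assumed to hold its own list of `(features, label)` pairs, standing in for NetFlow records in a shared feature format.

```python
import math


def local_update(weights, samples, lr=0.1, epochs=5):
    """Train on one organisation's private data (assumed logistic regression).

    `samples` is a list of (features, label) pairs that never leaves the client.
    """
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for features, label in samples:
            z = sum(wi * xi for wi, xi in zip(w, features))
            pred = 1.0 / (1.0 + math.exp(-z))
            for j, xj in enumerate(features):
                grad[j] += (pred - label) * xj
        # Gradient-descent step on the mean log-loss gradient.
        w = [wi - lr * gj / len(samples) for wi, gj in zip(w, grad)]
    return w


def federated_average(client_weights, client_sizes):
    """Average client weight vectors, weighted by local sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]


def run_round(global_weights, clients):
    """One round: every client trains locally, then only weights are averaged."""
    updates = [local_update(global_weights, samples) for samples in clients]
    sizes = [len(samples) for samples in clients]
    return federated_average(updates, sizes)
```

In this sketch only the weight vectors returned by `local_update` cross organisational boundaries; the raw samples stay with each client, which is the privacy property the scheme relies on. Weighting by sample count is one common choice for the averaging step and is an assumption here.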