The cyber-threat landscape has evolved tremendously in recent years, with new threat variants emerging daily and large-scale coordinated campaigns becoming more prevalent. In this study, we propose CELEST (CollaborativE LEarning for Scalable Threat detection), a federated machine learning framework for global threat detection over HTTP, one of the most commonly used protocols for malware dissemination and communication. CELEST leverages federated learning to collaboratively train a global model across multiple clients who keep their data local, thus providing increased privacy and confidentiality assurances. Through a novel active learning component integrated with the federated learning technique, our system continuously discovers and learns the behavior of new, evolving, and globally coordinated cyber threats. We show that CELEST is able to expose attacks that are largely invisible to individual organizations. For instance, in one challenging attack scenario with data exfiltration malware, the global model achieves a three-fold increase in Precision-Recall AUC compared to the local model. We also design DTrust, a poisoning detection and mitigation method tailored to federated learning in the collaborative threat detection domain. DTrust successfully detects poisoning clients using feedback from participating clients, so that they can be investigated and removed from the training process. We deploy CELEST on two university networks and show that it detects malicious HTTP communication with high precision and low false positive rates. Furthermore, during its deployment, CELEST detected a previously unknown set of 42 malicious URLs and 20 malicious domains in one day, which were confirmed to be malicious by VirusTotal.
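To make the federated setup concrete, the following is a minimal sketch of one common aggregation scheme, federated averaging (FedAvg), in which each client trains on its own data and only model weights reach the server. The abstract does not state which aggregation rule or model CELEST uses, so the logistic-regression model, the function names (`local_update`, `federated_average`, `train_global`), and all hyperparameters here are illustrative assumptions, not the authors' implementation.

```python
import math


def local_update(weights, data, lr=0.1, epochs=5):
    """Hypothetical client step: logistic-regression SGD on private data.

    `data` is a list of (feature_vector, label) pairs that never leaves
    the client; only the updated weights are returned.
    """
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of "malicious"
            for i, xi in enumerate(x):
                w[i] -= lr * (p - y) * xi   # gradient of the log loss
    return w


def federated_average(client_weights, client_sizes):
    """Server step: average client weights, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def train_global(clients, dim, rounds=10):
    """Run several rounds: broadcast weights, train locally, aggregate."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in clients]
        global_w = federated_average(updates, [len(d) for d in clients])
    return global_w


if __name__ == "__main__":
    # Toy data (assumed): feature vector is [bias, score]; label 1 = malicious.
    clients = [
        [([1.0, 2.0], 1), ([1.0, -2.0], 0)],
        [([1.0, 1.5], 1), ([1.0, -1.0], 0), ([1.0, -3.0], 0)],
    ]
    print(train_global(clients, dim=2))
```

In this sketch the server sees only weight vectors and dataset sizes, which is the property the abstract relies on for its privacy argument. The active learning and poisoning-detection components described above are not modeled here.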