ML-based Phishing URL (MLPU) detectors serve as the first level of defence to protect users and organisations from becoming victims of phishing attacks. Recently, a few studies have launched successful adversarial attacks against specific MLPU detectors, raising questions about their practical reliability and usage. Nevertheless, the robustness of these systems has not been extensively investigated, so their security vulnerabilities remain largely unknown, which calls for systematic robustness testing. In this article, we propose a methodology to investigate the reliability and robustness of 50 representative state-of-the-art MLPU models. Firstly, we propose a cost-effective adversarial URL generator, URLBUG, and use it to create an adversarial URL dataset ($Adv_\mathrm{data}$). Subsequently, we reproduce 50 MLPU systems (traditional ML and deep learning) and record their baseline performance. Lastly, we test the considered MLPU systems on $Adv_\mathrm{data}$ and analyze their robustness and reliability using box plots and heat maps. Our results show that the generated adversarial URLs have valid syntax and can be registered at a median annual price of \$11.99. Of the 13\% of adversarial URLs that were already registered, 63.94\% were used for malicious purposes. Moreover, the median Matthews Correlation Coefficient (MCC) of the considered MLPU models dropped from 0.92 to 0.02 when tested against $Adv_\mathrm{data}$, indicating that the baseline MLPU models are unreliable in their current form. Further, our findings identify several security vulnerabilities of these systems and provide future directions for researchers to design dependable and secure MLPU systems.
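For readers unfamiliar with the metric, the following is a minimal sketch of how the Matthews Correlation Coefficient is computed from binary confusion-matrix counts. The function name `mcc` and the confusion counts are illustrative assumptions, not values from the study; the counts were chosen only so that the resulting scores land near the reported medians of 0.92 and 0.02.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts (NOT from the paper): 100 phishing and 100 benign URLs.
baseline = mcc(tp=96, tn=96, fp=4, fn=4)        # detector on ordinary test data
adversarial = mcc(tp=51, tn=51, fp=49, fn=49)   # same detector on adversarial URLs

print(round(baseline, 2))     # 0.92
print(round(adversarial, 2))  # 0.02
```

An MCC of 1 means perfect agreement with the labels and 0 means performance no better than chance, so a score near 0.02 corresponds to a detector that is close to guessing.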