Many cyberattacks start with the dissemination of phishing URLs. When a victim clicks one of these URLs, the victim's private information is leaked to the attacker. Several machine learning methods have been proposed to detect phishing URLs. However, detecting phishing URLs with evasion, i.e., phishing URLs that pretend to be benign by manipulating patterns, remains under-explored. In many cases, the attacker i) reuses prepared phishing web pages because building a completely new set incurs non-trivial cost, ii) prefers hosting companies that do not require private information and are cheaper than others, iii) prefers shared hosting for cost efficiency, and iv) sometimes uses benign domains, IP addresses, and URL string patterns to evade existing detection methods. Inspired by these behavioral characteristics, we present a network-based inference method that accurately detects phishing URLs camouflaged with legitimate patterns, i.e., a method that is robust to evasion. In the network approach, a phishing URL is still identified as phishing even after evasion unless a majority of its neighbors in the network are also camouflaged at the same time. Our method consistently outperforms state-of-the-art methods across various experimental tests, e.g., an F1 score of 0.89 for our method vs. 0.84 for the best feature-based method.
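The neighbor-majority intuition above can be illustrated with a minimal sketch. This is not the inference method of the paper, whose details are not given here. It is a simplified majority vote over labeled neighbors, under the assumption that two URLs are neighbors when they share at least one entity such as an IP address, a hosting provider, or a URL string pattern. The function names, the entity strings, and the toy data are all hypothetical.

```python
from collections import defaultdict


def build_neighbors(url_entities):
    """Link URLs that share at least one entity (assumed: IP, host, word)."""
    by_entity = defaultdict(set)
    for url, entities in url_entities.items():
        for entity in entities:
            by_entity[entity].add(url)

    neighbors = defaultdict(set)
    for urls in by_entity.values():
        for url in urls:
            neighbors[url] |= urls - {url}
    return neighbors


def classify_by_neighbors(url, neighbors, known_labels):
    """Label a URL by a strict majority vote of its labeled neighbors."""
    labeled = [known_labels[n] for n in neighbors[url] if n in known_labels]
    if not labeled:
        return "unknown"
    phishing_votes = sum(1 for label in labeled if label == "phishing")
    return "phishing" if phishing_votes * 2 > len(labeled) else "benign"


# Hypothetical toy data. "target" borrows a benign word pattern ("word:login")
# to evade detection, but still shares an IP and a host with known phishing URLs.
URL_ENTITIES = {
    "u1": {"ip:203.0.113.5", "host:cheap-host"},
    "u2": {"ip:203.0.113.5"},
    "u3": {"host:cheap-host"},
    "u4": {"word:login"},
    "target": {"ip:203.0.113.5", "host:cheap-host", "word:login"},
}
KNOWN_LABELS = {"u1": "phishing", "u2": "phishing", "u3": "phishing", "u4": "benign"}

neighbors = build_neighbors(URL_ENTITIES)
print(classify_by_neighbors("target", neighbors, KNOWN_LABELS))  # prints "phishing"
```

In this toy setting, borrowing one benign pattern adds only a single benign neighbor, so the three phishing neighbors still outvote it. The label would flip only if most of those neighbors were disguised as well.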