Malware campaigns have reached a high level of sophistication, thanks to cryptography and covert communication channels that run over traditional protocols and services. A typical approach to evading botnet identification and takedown mechanisms is domain fluxing by means of Domain Generation Algorithms (DGAs). These algorithms produce an overwhelming number of domain names that the infected device tries to contact in order to find the Command and Control server, yet only a small fraction of them is actually registered. Because of the sheer number of domain names, blacklisting is rendered useless. The botmaster can therefore shift control dynamically and hinder botnet detection mechanisms. To counter this problem, many security mechanisms resort to identifying DGA-generated domains based on the randomness of their names. In this work, we explore hard-to-detect families of DGAs, which are constructed to bypass these mechanisms. More precisely, they are based on dictionaries, so the domains appear to be user-generated. Consequently, the generated domains pass many filters that look for, e.g., high-entropy strings. To address this challenge, we propose an accurate and efficient probabilistic approach to detect them. We test and validate the proposed solution through extensive experiments on a sound dataset containing all the wordlist-based DGA families that exhibit this behaviour, and compare it with other state-of-the-art methods, showing in practice the efficacy and superiority of our proposal.
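To make the evasion concrete, the following minimal sketch contrasts a character-entropy score for a random-looking label with the score for a label built from dictionary words. It is only an illustration under stated assumptions: the word list, the `toy_wordlist_dga` generator, the seed format, and the example strings are all hypothetical and do not correspond to any real malware family, and the entropy score is a generic randomness heuristic, not the probabilistic detector proposed in this work.

```python
import hashlib
import math
from collections import Counter

# Hypothetical word list; real wordlist-based families use far larger dictionaries.
WORDS = ["garden", "window", "silver", "harbor", "market", "planet", "bridge", "letter"]


def char_entropy(label):
    """Shannon entropy of the label's character distribution, in bits per character."""
    counts = Counter(label)
    total = len(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def toy_wordlist_dga(seed, count, tld="com"):
    """Toy generator (an assumption, not a real family): two dictionary words per domain.

    Word indices are taken from a SHA-256 digest of the seed and a counter,
    so the same seed always yields the same list of domains.
    """
    domains = []
    for i in range(count):
        digest = hashlib.sha256(f"{seed}:{i}".encode()).digest()
        first = WORDS[digest[0] % len(WORDS)]
        second = WORDS[digest[1] % len(WORDS)]
        domains.append(f"{first}{second}.{tld}")
    return domains


if __name__ == "__main__":
    random_looking = "xkq7vz2mpw9hbt4j"  # hand-picked: 16 distinct characters
    word_based = "gardenwindow"          # hand-picked: two dictionary words
    print(f"{random_looking}: {char_entropy(random_looking):.2f} bits/char")
    print(f"{word_based}: {char_entropy(word_based):.2f} bits/char")
    for domain in toy_wordlist_dga("2024-01-01", 3):
        label = domain.split(".")[0]
        print(f"{domain}: {char_entropy(label):.2f} bits/char")
```

In this hand-picked pair the random-looking label scores 4.0 bits per character while the word-based label scores about 3.08, so a filter that flags only high-entropy names would let the latter through. The margin depends heavily on label length and alphabet, which is one reason randomness heuristics alone are a weak signal against dictionary-based families.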