Intercepting Hail Hydra: Real-Time Detection of Algorithmically Generated Domains

A crucial technical challenge for cybercriminals is to keep control over the potentially millions of infected devices that make up their botnets without compromising the robustness of their attacks. A single, fixed command-and-control (C&C) server, for example, can be trivially detected through either binary or traffic analysis and immediately sinkholed or taken down by security researchers or law enforcement. Botnets therefore often use Domain Generation Algorithms (DGAs), primarily to evade take-down attempts. DGAs can extend the lifespan of a malware campaign, thus potentially enhancing its profitability. They can also contribute to hindering attack accountability. In this work, we introduce HYDRAS, the most comprehensive and representative dataset of Algorithmically Generated Domains (AGDs) available to date. The dataset contains more than 100 DGA families, including both real-world and adversarially designed ones. We analyse the dataset and discuss the possibility of differentiating between benign requests (to real domains) and malicious ones (to AGDs) in real time. The simultaneous study of so many families and variants introduces several challenges; nonetheless, it alleviates biases found in previous literature, which employs small datasets and frequently overfits by exploiting characteristic features of particular families that do not generalise well. We thoroughly compare our approach with the current state of the art and highlight some methodological shortcomings in the current state of practice. The results show that our proposed approach significantly outperforms the current state of the art in terms of both classification performance and efficiency.
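To make the idea of a DGA concrete, the following minimal Python sketch shows how a bot and its operator could derive the same list of candidate domains from a shared seed and the current date, so that no fixed C&C address needs to be embedded in the binary. It is an illustrative toy, not a family from the HYDRAS dataset: the function names `toy_dga` and `shannon_entropy`, the seed string, and the reserved `.example` suffix are all assumptions made for this example. The entropy helper shows one simple lexical feature often computed on domain labels; it is not presented as the feature set used in this work.

```python
import hashlib
import math
from collections import Counter
from datetime import date


def toy_dga(seed: str, day: date, count: int = 5,
            length: int = 12, tld: str = ".example") -> list[str]:
    """Derive `count` pseudo-random domain names from a shared seed and a date.

    Hypothetical toy generator: anyone who knows the seed and the date
    obtains the same list, which is the property real DGAs rely on.
    """
    domains = []
    for index in range(count):
        material = f"{seed}|{day.isoformat()}|{index}".encode("utf-8")
        digest = hashlib.sha256(material).digest()
        # Map the first `length` digest bytes onto lowercase letters.
        label = "".join(chr(ord("a") + byte % 26) for byte in digest[:length])
        domains.append(label + tld)
    return domains


def shannon_entropy(label: str) -> float:
    """Shannon entropy, in bits per character, of a domain label."""
    counts = Counter(label)
    total = len(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    for domain in toy_dga("hypothetical-seed", date(2024, 1, 1)):
        label = domain.split(".")[0]
        print(domain, round(shannon_entropy(label), 3))
```

Running the script prints five generated names together with the entropy of each label. A single feature such as entropy may separate some random-looking families from benign names, but, in line with the overfitting concern raised above, it should not be expected to generalise to families built from dictionary words or designed adversarially.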
