Numerous malware families rely on domain generation algorithms (DGAs) to establish a connection to their command and control (C2) server. To counteract DGAs, several machine learning classifiers have been proposed that identify the DGA that generated a specific domain name, which in turn enables targeted remediation measures. However, the proposed state-of-the-art classifiers are based on deep learning models, whose black-box nature makes it difficult to evaluate their reasoning. The resulting lack of confidence makes the use of such models impracticable. In this paper, we propose EXPLAIN, a feature-based and contextless DGA multiclass classifier. We comparatively evaluate several combinations of feature sets and hyperparameters for our approach against several state-of-the-art classifiers in a unified setting on the same real-world data. Our classifier achieves competitive results, is real-time capable, and its predictions are easier to trace back to features than those of the DGA multiclass classifiers proposed in related work.