Deep convolutional neural networks (CNNs) can be applied to malware binary detection by treating it as an image classification problem. Their performance, however, degrades when the malware families (classes) are imbalanced. To mitigate this issue, we propose a simple yet effective weighted softmax loss that can be used as the final layer of a deep CNN. Each class's contribution to the original softmax loss is multiplied by a weight determined by the class size, and a scaling parameter is included in computing the weight. We study how to select this parameter and suggest an empirical choice. The weighted loss aims to alleviate the impact of data imbalance in an end-to-end learning fashion. To validate its efficacy, we deploy the proposed weighted loss in a pre-trained deep CNN model and fine-tune it, achieving promising results on malware image classification. Extensive experiments also demonstrate that the new loss function works well with other typical CNNs, yielding improved classification performance.
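The abstract does not state the weight formula, so the following is only a minimal sketch of the general idea of a class-size-weighted softmax loss. The weight rule `1 + beta * (n_max - n_c) / n_max`, the name `beta` for the scaling parameter, and both function names are assumptions made for illustration and may differ from the paper's definition.

```python
import math


def class_weights(class_sizes, beta=1.0):
    """Assumed weighting rule (not taken from the paper):
    the largest class gets weight 1, smaller classes get up to 1 + beta."""
    n_max = max(class_sizes)
    return [1.0 + beta * (n_max - n) / n_max for n in class_sizes]


def weighted_softmax_loss(logits, labels, weights):
    """Mean softmax cross-entropy where each sample's loss is multiplied
    by the weight of its true class."""
    total = 0.0
    for row, label in zip(logits, labels):
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(row)
        log_sum = m + math.log(sum(math.exp(z - m) for z in row))
        nll = log_sum - row[label]  # negative log-probability of the true class
        total += weights[label] * nll
    return total / len(labels)


# Hypothetical class sizes for three malware families.
weights = class_weights([100, 50, 10], beta=1.0)
loss = weighted_softmax_loss(
    logits=[[2.0, 0.5, 0.1], [0.2, 0.1, 1.5]],
    labels=[0, 2],
    weights=weights,
)
```

With `beta = 0` every weight equals 1 and the function reduces to the ordinary softmax loss, so under this assumed rule the scaling parameter controls how strongly minority classes are emphasized.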