A Comparative Performance Analysis of Explainable Machine Learning Models With And Without RFECV Feature Selection Technique Towards Ransomware Classification

Feature Selection · Performer · Machine Learning · Analysis · Learning

December 9, 2022



Rawshan Ara Mowri, Madhuri Siddula, Kaushik Roy

From arXiv. arXiv admin note: text overlap with arXiv:2210.11235

Ransomware has recently emerged as one of the major global threats. The alarming rise in ransomware attacks and new ransomware variants pushes researchers in this domain to continually examine the distinguishing traits of ransomware and refine their detection and classification strategies. Among the broad range of behavioral characteristics, Application Programming Interface (API) calls and network behaviors have been widely used as differentiating factors for ransomware detection or classification. Many prior approaches have shown promising results in detecting and classifying ransomware families using these features without applying any feature selection technique. Feature selection, however, is a potentially valuable step toward an efficient detection or classification Machine Learning model, because it reduces the probability of overfitting by removing redundant data, improves the model's accuracy by eliminating irrelevant features, and therefore reduces training time. A good number of feature selection techniques are currently used in different security scenarios to optimize the performance of Machine Learning models. The aim of this study is therefore to present a comparative performance analysis of widely used supervised Machine Learning models with and without the RFECV feature selection technique for ransomware classification, using API call and network traffic features. The study thereby provides insight into the efficiency of RFECV for ransomware classification, which peers can use as a reference when choosing a feature selection technique for future work in this domain.
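The "with and without RFECV" comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn's `RFECV` (recursive feature elimination with cross-validation), a logistic regression classifier, and a synthetic two-class dataset from `make_classification` standing in for the API call and network traffic features. The sample sizes, fold count, and scoring metric are all assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a feature matrix (NOT the paper's data):
# 200 samples, 20 features, of which 5 are informative and 5 redundant.
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5, n_redundant=5,
    random_state=0,
)

cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)


def make_model():
    # Assumed classifier; any estimator exposing coef_ or
    # feature_importances_ can drive the elimination.
    return LogisticRegression(max_iter=1000)


# Without RFECV: the classifier sees all 20 features.
baseline = make_pipeline(StandardScaler(), make_model())

# With RFECV: features are eliminated one at a time, and the subset size
# with the best inner cross-validated accuracy is kept. Placing the selector
# inside the pipeline keeps selection within each outer training fold.
with_rfecv = make_pipeline(
    StandardScaler(),
    RFECV(estimator=make_model(), step=1, cv=cv, scoring="accuracy"),
    make_model(),
)

baseline_scores = cross_val_score(baseline, X, y, cv=cv, scoring="accuracy")
rfecv_scores = cross_val_score(with_rfecv, X, y, cv=cv, scoring="accuracy")

# Refit on all data only to inspect which features were kept.
selector = with_rfecv.fit(X, y).named_steps["rfecv"]
n_selected = selector.n_features_
selected_mask = selector.support_

print("accuracy without RFECV:", baseline_scores.mean())
print("accuracy with RFECV:   ", rfecv_scores.mean())
print("features kept:", n_selected, "of", X.shape[1])
```

Whether selection helps depends on the data and the model, so the two accuracy figures are best read side by side rather than assumed to favor either variant.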



Feature Selection


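Feature selection, also called feature subset selection (FSS) or attribute selection, is the process of choosing N features from an existing set of M features so that a specific metric of the system is optimized. It picks the most effective features from the original ones in order to reduce the dimensionality of the dataset. It is an important means of improving the performance of learning algorithms and a key data preprocessing step in pattern recognition. For a learning algorithm, good training samples are the key to training a model.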
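The "choose N of M features to optimize a metric" definition can be made concrete with a brute-force sketch. The code below is only an illustration under stated assumptions: the toy dataset, the helper names `loo_1nn_accuracy` and `select_features`, and the choice of leave-one-out 1-nearest-neighbour accuracy as the metric are all hypothetical. Exhaustive search is practical only for small M, which is why practical methods such as recursive elimination search greedily instead.

```python
from itertools import combinations

# Toy data (hypothetical): M = 3 features per row.
# Feature 0 separates the classes, feature 1 is misleading, feature 2 is constant.
ROWS = [
    [0.1, 5.0, 1.0],
    [0.2, 1.0, 1.0],
    [0.3, 4.0, 1.0],
    [0.9, 4.5, 1.0],
    [1.0, 1.5, 1.0],
    [1.1, 5.5, 1.0],
]
LABELS = [0, 0, 0, 1, 1, 1]


def loo_1nn_accuracy(rows, labels, subset):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    that only looks at the feature indices in `subset`."""
    def sq_dist(a, b):
        return sum((a[j] - b[j]) ** 2 for j in subset)

    correct = 0
    for i, row in enumerate(rows):
        others = [k for k in range(len(rows)) if k != i]
        nearest = min(others, key=lambda k: sq_dist(row, rows[k]))
        correct += labels[nearest] == labels[i]
    return correct / len(rows)


def select_features(rows, labels, n):
    """Return the n-feature subset (tuple of indices) with the best metric,
    found by scoring every combination of n out of M features."""
    m = len(rows[0])
    return max(
        combinations(range(m), n),
        key=lambda subset: loo_1nn_accuracy(rows, labels, subset),
    )


best = select_features(ROWS, LABELS, n=1)
print("best single feature:", best,
      "accuracy:", loo_1nn_accuracy(ROWS, LABELS, best))
```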
