Mobile malware comprises malicious programs that target mobile devices. It is a growing problem, as shown by the rising number of mobile malware samples detected each year. The number of active smartphone users is expected to grow, which underlines the importance of research on mobile malware detection. Detection methods for mobile malware exist but remain limited. In this paper, we provide an overview of the performance of machine learning (ML) techniques for detecting malware on Android without privileged access. The ML classifiers use device information such as CPU usage, battery usage, and memory usage to detect 10 subtypes of Mobile Trojans on the Android operating system (OS). We use a real-life dataset containing device and malware data from 47 users collected over one year (2016). We examine which features, i.e. aspects of a device, are most important to monitor in order to detect (subtypes of) Mobile Trojans. The focus of this paper is on dynamic hardware features. Using these dynamic features, we apply state-of-the-art machine learning classifiers: Random Forest, K-Nearest Neighbours, and AdaBoost. We report classification results on different feature sets, distinguishing between global device features and app-specific features. None of the measured feature sets requires privileged access. Our results show that the Random Forest classifier performs best as a general malware classifier: across 10 subtypes of Mobile Trojans, it achieves an F1 score of 0.73 with a False Positive Rate (FPR) of 0.009 and a False Negative Rate (FNR) of 0.380. When trained separately to detect each subtype of Mobile Trojans, the Random Forest, K-Nearest Neighbours, and AdaBoost classifiers achieve F1 scores above 0.72, an FPR below 0.02, and an FNR below 0.33.
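The three reported metrics (F1, FPR, FNR) all derive from the same confusion matrix. The following is a minimal sketch of how they relate for a single binary detector, assuming labels of 1 for malware and 0 for benign. The function name `detection_metrics` and the toy labels are illustrative assumptions, not taken from the paper, and the sketch does not reproduce how the paper aggregates scores across the 10 Trojan subtypes.

```python
from typing import Dict, Sequence


def detection_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute F1, FPR and FNR for binary labels (1 = malware, 0 = benign).

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malware correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malware missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign correctly passed

    def ratio(num: float, den: float) -> float:
        # Return 0.0 when the denominator is zero (an assumed convention).
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "f1": ratio(2 * precision * recall, precision + recall),
        "fpr": ratio(fp, fp + tn),  # share of benign samples raised as alarms
        "fnr": ratio(fn, fn + tp),  # share of malware samples missed
    }


if __name__ == "__main__":
    # Toy labels, invented purely to exercise the function.
    truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    preds = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    print(detection_metrics(truth, preds))
```

Because FNR equals one minus recall, a detector can keep its FPR very low while still missing a sizeable share of malware, which is why the three numbers are best read together.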