Understanding the dynamic behavior of computer programs under normal operating conditions is an important task with multiple security benefits, including behavior-based anomaly detection, vulnerability discovery, and patching. Existing work pursues this goal by collecting and analyzing data such as network traffic, system calls, and instruction traces. In this paper, we explore the use of a new type of data, performance counters, to analyze the dynamic behavior of programs. Using existing primitives, we develop a tool named perfextract that captures data from different performance counters during a program's startup, forming multiple time series that represent the program's dynamic behavior. We analyze the collected data and develop a semi-supervised clustering algorithm that assigns each program to a specific group based on its performance counter time series and identifies the intrinsic behavior of that group. We carry out extensive experiments with 18 real-world programs belonging to 4 groups: web browsers, text editors, image viewers, and audio players. The experimental results show that the examined programs can be accurately differentiated based on their performance counter data, regardless of whether they run in physical or virtual environments.
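To make the idea of grouping programs by their counter time series concrete, the following is a minimal sketch of one common semi-supervised technique, seeded k-means, in which a few labeled examples fix the initial centroids. It is not the algorithm developed in the paper, whose details the abstract does not give. The function names `featurize` and `seeded_kmeans`, the choice of mean and peak as features, and all counter values are hypothetical and purely illustrative.

```python
from math import dist
from statistics import fmean


def featurize(series):
    """Summarize one counter time series as (mean, peak).

    Assumption: two summary features are enough for illustration.
    """
    return (fmean(series), max(series))


def seeded_kmeans(points, seeds, iterations=10):
    """Seeded k-means: labeled seed points define the initial centroids.

    points: list of feature tuples without labels
    seeds:  dict mapping a group name to a non-empty list of labeled tuples
    Returns a dict mapping each point index to a group name.
    """
    # Initial centroid of each group is the mean of its labeled seeds.
    centroids = {g: tuple(fmean(c) for c in zip(*pts)) for g, pts in seeds.items()}
    assignment = {}
    for _ in range(iterations):
        # Assign every unlabeled point to its nearest centroid.
        assignment = {
            i: min(centroids, key=lambda g: dist(p, centroids[g]))
            for i, p in enumerate(points)
        }
        # Recompute centroids from assigned points plus the fixed seeds.
        for g in centroids:
            members = [points[i] for i, a in assignment.items() if a == g] + seeds[g]
            centroids[g] = tuple(fmean(c) for c in zip(*members))
    return assignment


# Synthetic counter samples, invented for this example only.
seeds = {
    "web browser": [featurize([900, 1200, 1100])],
    "text editor": [featurize([100, 150, 120])],
}
unlabeled = [featurize([950, 1150, 1000]), featurize([90, 140, 130])]
labels = seeded_kmeans(unlabeled, seeds)
```

Here `labels` maps the index of each unlabeled series to the group whose centroid it ends up closest to. A realistic pipeline would likely normalize the counters and use richer features or a time-series distance, but the seeding step is what makes the clustering semi-supervised.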